KEVINA NUGRAHA ELEEAS - 2540120585

Video Link:

https://binusianorg-my.sharepoint.com/personal/kevina_eleeas_binus_ac_id/_layouts/15/guestaccess.aspx?guestaccesstoken=V%2FWeqDJa5fp%2BqTijHh6o9TKOCTzsz5mrVOIfSQNUM38%3D&docid=2_070f66b7ea8e44e7089b59b3f7ea3fa7d&rev=1&e=dfoiqo

1. Apply the following exploratory data analysis techniques using R on the CarPrice dataset: (25 pts.)
  1. Using the mfrow parameter, construct a two-by-two plot array showing the values of the following four attributes versus the record number in the dataset:
    1. carlength, top left;
    2. carwidth, top right;
    3. doornumber, lower left; and
    4. enginetype, lower right.

In all cases, the x-axis label should read “Record number in dataset” and the y-axis label should read the attribute name. Each plot should have a title spelling out the name of the attribute being plotted (e.g., “carlength” for the top-left plot).
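
As a rough sketch of the layout described above, the snippet below fills a two-by-two grid with par(mfrow = c(2, 2)). It is only an illustration under stated assumptions: `cars` is a toy data frame holding the first six records of the printed dataset, `plot_by_record` is a hypothetical helper, and showing the text attributes (doornumber, enginetype) as integer factor codes is one possible choice rather than a required one.

```r
# Toy stand-in for the full dataset: the first six records of the printed output.
cars <- data.frame(
  carlength  = c(168.8, 168.8, 171.2, 176.6, 176.6, 177.3),
  carwidth   = c(64.1, 64.1, 65.5, 66.2, 66.4, 66.3),
  doornumber = c("two", "two", "two", "four", "four", "two"),
  enginetype = c("dohc", "dohc", "ohcv", "ohc", "ohc", "ohc"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: plot one attribute against its record number.
plot_by_record <- function(data, column) {
  values <- data[[column]]
  if (is.numeric(values)) {
    # A single numeric vector is plotted against its index (the record number).
    plot(values, xlab = "Record number in dataset", ylab = column, main = column)
  } else {
    # Assumption: text attributes are shown as integer codes of a factor,
    # with the original category names written on the y-axis.
    f <- factor(values)
    plot(as.integer(f), xlab = "Record number in dataset", ylab = column,
         main = column, yaxt = "n")
    axis(2, at = seq_along(levels(f)), labels = levels(f))
  }
  invisible(values)
}

old_par <- par(mfrow = c(2, 2))  # 2x2 grid, filled row by row
for (column in c("carlength", "carwidth", "doornumber", "enginetype")) {
  plot_by_record(cars, column)
}
par(old_par)                     # restore the previous layout
```

Because mfrow fills the grid row by row, the order of the column names in the loop decides which panel each attribute lands in: top left, top right, lower left, lower right.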

#write your code here
library(readr)
library(Hmisc)
library(dplyr)
library(caret)
df <- read.csv("D:\\SOAL UTS\\DTSC6005001_Data Mining and Visualization_REGULER_UTS\\CarPrice-OddSID.csv")
#View(df)
df
##     car_ID symboling                         CarName doornumber drivewheel
## 1        1         3              alfa-romero giulia        two        rwd
## 2        2         3             alfa-romero stelvio        two        rwd
## 3        3         1        alfa-romero Quadrifoglio        two        rwd
## 4        4         2                     audi 100 ls       four        fwd
## 5        5         2                      audi 100ls       four        4wd
## 6        6         2                        audi fox        two        fwd
## 7        7         1                      audi 100ls       four        fwd
## 8        8         1                       audi 5000       four        fwd
## 9        9         1                       audi 4000       four        fwd
## 10      10         0             audi 5000s (diesel)        two        4wd
## 11      11         2                        bmw 320i        two        rwd
## 12      12         0                        bmw 320i       four        rwd
## 13      13         0                          bmw x1        two        rwd
## 14      14         0                          bmw x3       four        rwd
## 15      15         1                          bmw z4       four        rwd
## 16      16         0                          bmw x4       four        rwd
## 17      17         0                          bmw x5        two        rwd
## 18      18         0                          bmw x3       four        rwd
## 19      19         2                chevrolet impala        two        fwd
## 20      20         1           chevrolet monte carlo        two        fwd
## 21      21         0             chevrolet vega 2300       four        fwd
## 22      22         1                   dodge rampage        two        fwd
## 23      23         1             dodge challenger se        two        fwd
## 24      24         1                      dodge d200        two        fwd
## 25      25         1               dodge monaco (sw)       four        fwd
## 26      26         1              dodge colt hardtop       four        fwd
## 27      27         1                 dodge colt (sw)       four        fwd
## 28      28         1            dodge coronet custom        two        fwd
## 29      29        -1               dodge dart custom       four        fwd
## 30      30         3       dodge coronet custom (sw)        two        fwd
## 31      31         2                     honda civic        two        fwd
## 32      32         2                honda civic cvcc        two        fwd
## 33      33         1                     honda civic        two        fwd
## 34      34         1               honda accord cvcc        two        fwd
## 35      35         1                honda civic cvcc        two        fwd
## 36      36         0                 honda accord lx       four        fwd
## 37      37         0             honda civic 1500 gl       four        fwd
## 38      38         0                    honda accord        two        fwd
## 39      39         0                honda civic 1300        two        fwd
## 40      40         0                   honda prelude       four        fwd
## 41      41         0                    honda accord       four        fwd
## 42      42         0                     honda civic       four        fwd
## 43      43         1              honda civic (auto)        two        fwd
## 44      44         0                      isuzu MU-X       four        rwd
## 45      45         1                    isuzu D-Max         two        fwd
## 46      46         0             isuzu D-Max V-Cross       four        fwd
## 47      47         2                    isuzu D-Max         two        rwd
## 48      48         0                       jaguar xj       four        rwd
## 49      49         0                       jaguar xf       four        rwd
## 50      50         0                       jaguar xk        two        rwd
## 51      51         1                       maxda rx3        two        fwd
## 52      52         1                maxda glc deluxe        two        fwd
## 53      53         1                 mazda rx2 coupe        two        fwd
## 54      54         1                      mazda rx-4       four        fwd
## 55      55         1                mazda glc deluxe       four        fwd
## 56      56         3                       mazda 626        two        rwd
## 57      57         3                       mazda glc        two        rwd
## 58      58         3                   mazda rx-7 gs        two        rwd
## 59      59         3                     mazda glc 4        two        rwd
## 60      60         1                       mazda 626        two        fwd
## 61      61         0              mazda glc custom l       four        fwd
## 62      62         1                mazda glc custom        two        fwd
## 63      63         0                      mazda rx-4       four        fwd
## 64      64         0                mazda glc deluxe       four        fwd
## 65      65         0                       mazda 626       four        fwd
## 66      66         0                       mazda glc       four        rwd
## 67      67         0                   mazda rx-7 gs       four        rwd
## 68      68        -1        buick electra 225 custom       four        rwd
## 69      69        -1        buick century luxus (sw)       four        rwd
## 70      70         0                   buick century        two        rwd
## 71      71        -1                   buick skyhawk       four        rwd
## 72      72        -1         buick opel isuzu deluxe       four        rwd
## 73      73         3                   buick skylark        two        rwd
## 74      74         0           buick century special       four        rwd
## 75      75         1 buick regal sport coupe (turbo)        two        rwd
## 76      76         1                  mercury cougar        two        rwd
## 77      77         2               mitsubishi mirage        two        fwd
## 78      78         2               mitsubishi lancer        two        fwd
## 79      79         2            mitsubishi outlander        two        fwd
## 80      80         1                   mitsubishi g4        two        fwd
## 81      81         3            mitsubishi mirage g4        two        fwd
## 82      82         3                   mitsubishi g4        two        fwd
## 83      83         3            mitsubishi outlander        two        fwd
## 84      84         3                   mitsubishi g4        two        fwd
## 85      85         3            mitsubishi mirage g4        two        fwd
## 86      86         1              mitsubishi montero       four        fwd
## 87      87         1               mitsubishi pajero       four        fwd
## 88      88         1            mitsubishi outlander       four        fwd
## 89      89        -1            mitsubishi mirage g4       four        fwd
## 90      90         1                    Nissan versa        two        fwd
## 91      91         1                     nissan gt-r        two        fwd
## 92      92         1                    nissan rogue        two        fwd
## 93      93         1                    nissan latio       four        fwd
## 94      94         1                    nissan titan       four        fwd
## 95      95         1                     nissan leaf        two        fwd
## 96      96         1                     nissan juke        two        fwd
## 97      97         1                    nissan latio       four        fwd
## 98      98         1                     nissan note       four        fwd
## 99      99         2                  nissan clipper        two        fwd
## 100    100         0                    nissan rogue       four        fwd
## 101    101         0                    nissan nv200       four        fwd
## 102    102         0                     nissan dayz       four        fwd
## 103    103         0                     nissan fuga       four        fwd
## 104    104         0                     nissan otti       four        fwd
## 105    105         3                    nissan teana        two        rwd
## 106    106         3                    nissan kicks        two        rwd
## 107    107         1                  nissan clipper        two        rwd
## 108    108         0                     peugeot 504       four        rwd
## 109    109         0                     peugeot 304       four        rwd
## 110    110         0                peugeot 504 (sw)       four        rwd
## 111    111         0                     peugeot 504       four        rwd
## 112    112         0                     peugeot 504       four        rwd
## 113    113         0                   peugeot 604sl       four        rwd
## 114    114         0                     peugeot 504       four        rwd
## 115    115         0       peugeot 505s turbo diesel       four        rwd
## 116    116         0                     peugeot 504       four        rwd
## 117    117         0                     peugeot 504       four        rwd
## 118    118         0                   peugeot 604sl       four        rwd
## 119    119         1               plymouth fury iii        two        fwd
## 120    120         1                plymouth cricket        two        fwd
## 121    121         1               plymouth fury iii       four        fwd
## 122    122         1  plymouth satellite custom (sw)       four        fwd
## 123    123         1        plymouth fury gran sedan       four        fwd
## 124    124        -1                plymouth valiant       four        fwd
## 125    125         3                 plymouth duster        two        rwd
## 126    126         3                   porsche macan        two        rwd
## 127    127         3               porcshce panamera        two        rwd
## 128    128         3                 porsche cayenne        two        rwd
## 129    129         3                  porsche boxter        two        rwd
## 130    130         1                 porsche cayenne        two        rwd
## 131    131         0                    renault 12tl       four        fwd
## 132    132         2                   renault 5 gtl        two        fwd
## 133    133         3                        saab 99e        two        fwd
## 134    134         2                       saab 99le       four        fwd
## 135    135         3                       saab 99le        two        fwd
## 136    136         2                      saab 99gle       four        fwd
## 137    137         3                      saab 99gle        two        fwd
## 138    138         2                        saab 99e       four        fwd
## 139    139         2                          subaru        two        fwd
## 140    140         2                       subaru dl        two        fwd
## 141    141         2                       subaru dl        two        4wd
## 142    142         0                          subaru       four        fwd
## 143    143         0                      subaru brz       four        fwd
## 144    144         0                     subaru baja       four        fwd
## 145    145         0                       subaru r1       four        4wd
## 146    146         0                       subaru r2       four        4wd
## 147    147         0                   subaru trezia       four        fwd
## 148    148         0                  subaru tribeca       four        fwd
## 149    149         0                       subaru dl       four        4wd
## 150    150         0                       subaru dl       four        4wd
## 151    151         1           toyota corona mark ii        two        fwd
## 152    152         1                   toyota corona        two        fwd
## 153    153         1             toyota corolla 1200       four        fwd
## 154    154         0           toyota corona hardtop       four        fwd
## 155    155         0        toyota corolla 1600 (sw)       four        4wd
## 156    156         0                   toyota carina       four        4wd
## 157    157         0                  toyota mark ii       four        fwd
## 158    158         0             toyota corolla 1200       four        fwd
## 159    159         0                   toyota corona       four        fwd
## 160    160         0                  toyota corolla       four        fwd
## 161    161         0                   toyota corona       four        fwd
## 162    162         0                  toyota corolla       four        fwd
## 163    163         0                  toyota mark ii       four        fwd
## 164    164         1         toyota corolla liftback        two        rwd
## 165    165         1                   toyota corona        two        rwd
## 166    166         1       toyota celica gt liftback        two        rwd
## 167    167         1           toyota corolla tercel        two        rwd
## 168    168         2          toyota corona liftback        two        rwd
## 169    169         2                  toyota corolla        two        rwd
## 170    170         2                  toyota starlet        two        rwd
## 171    171         2                   toyota tercel        two        rwd
## 172    172         2                  toyota corolla        two        rwd
## 173    173         2                 toyota cressida        two        rwd
## 174    174        -1                  toyota corolla       four        fwd
## 175    175        -1                toyota celica gt       four        fwd
## 176    176        -1                   toyota corona       four        fwd
## 177    177        -1                  toyota corolla       four        fwd
## 178    178        -1                  toyota mark ii       four        fwd
## 179    179         3         toyota corolla liftback        two        rwd
## 180    180         3                   toyota corona        two        rwd
## 181    181        -1                  toyota starlet       four        rwd
## 182    182        -1                  toyouta tercel       four        rwd
## 183    183         2                vokswagen rabbit        two        fwd
## 184    184         2    volkswagen 1131 deluxe sedan        two        fwd
## 185    185         2            volkswagen model 111       four        fwd
## 186    186         2               volkswagen type 3       four        fwd
## 187    187         2             volkswagen 411 (sw)       four        fwd
## 188    188         2         volkswagen super beetle       four        fwd
## 189    189         2               volkswagen dasher       four        fwd
## 190    190         3                       vw dasher        two        fwd
## 191    191         3                       vw rabbit        two        fwd
## 192    192         0               volkswagen rabbit       four        fwd
## 193    193         0        volkswagen rabbit custom       four        fwd
## 194    194         0               volkswagen dasher       four        fwd
## 195    195        -2                 volvo 145e (sw)       four        rwd
## 196    196        -1                     volvo 144ea       four        rwd
## 197    197        -2                     volvo 244dl       four        rwd
## 198    198        -1                       volvo 245       four        rwd
## 199    199        -2                     volvo 264gl       four        rwd
## 200    200        -1                    volvo diesel       four        rwd
## 201    201        -1                 volvo 145e (sw)       four        rwd
## 202    202        -1                     volvo 144ea       four        rwd
## 203    203        -1                     volvo 244dl       four        rwd
## 204    204        -1                       volvo 246       four        rwd
## 205    205        -1                     volvo 264gl       four        rwd
##     carlength carwidth enginetype enginesize fuelsystem horsepower peakrpm
## 1       168.8     64.1       dohc        130       mpfi        111    5000
## 2       168.8     64.1       dohc        130       mpfi        111    5000
## 3       171.2     65.5       ohcv        152       mpfi        154    5000
## 4       176.6     66.2        ohc        109       mpfi        102    5500
## 5       176.6     66.4        ohc        136       mpfi        115    5500
## 6       177.3     66.3        ohc        136       mpfi        110    5500
## 7       192.7     71.4        ohc        136       mpfi        110    5500
## 8       192.7     71.4        ohc        136       mpfi        110    5500
## 9       192.7     71.4        ohc        131       mpfi        140    5500
## 10      178.2     67.9        ohc        131       mpfi        160    5500
## 11      176.8     64.8        ohc        108       mpfi        101    5800
## 12      176.8     64.8        ohc        108       mpfi        101    5800
## 13      176.8     64.8        ohc        164       mpfi        121    4250
## 14      176.8     64.8        ohc        164       mpfi        121    4250
## 15      189.0     66.9        ohc        164       mpfi        121    4250
## 16      189.0     66.9        ohc        209       mpfi        182    5400
## 17      193.8     67.9        ohc        209       mpfi        182    5400
## 18      197.0     70.9        ohc        209       mpfi        182    5400
## 19      141.1     60.3          l         61       2bbl         48    5100
## 20      155.9     63.6        ohc         90       2bbl         70    5400
## 21      158.8     63.6        ohc         90       2bbl         70    5400
## 22      157.3     63.8        ohc         90       2bbl         68    5500
## 23      157.3     63.8        ohc         90       2bbl         68    5500
## 24      157.3     63.8        ohc         98       mpfi        102    5500
## 25      157.3     63.8        ohc         90       2bbl         68    5500
## 26      157.3     63.8        ohc         90       2bbl         68    5500
## 27      157.3     63.8        ohc         90       2bbl         68    5500
## 28      157.3     63.8        ohc         98       mpfi        102    5500
## 29      174.6     64.6        ohc        122       2bbl         88    5000
## 30      173.2     66.3        ohc        156        mfi        145    5000
## 31      144.6     63.9        ohc         92       1bbl         58    4800
## 32      144.6     63.9        ohc         92       1bbl         76    6000
## 33      150.0     64.0        ohc         79       1bbl         60    5500
## 34      150.0     64.0        ohc         92       1bbl         76    6000
## 35      150.0     64.0        ohc         92       1bbl         76    6000
## 36      163.4     64.0        ohc         92       1bbl         76    6000
## 37      157.1     63.9        ohc         92       1bbl         76    6000
## 38      167.5     65.2        ohc        110       1bbl         86    5800
## 39      167.5     65.2        ohc        110       1bbl         86    5800
## 40      175.4     65.2        ohc        110       1bbl         86    5800
## 41      175.4     62.5        ohc        110       1bbl         86    5800
## 42      175.4     65.2        ohc        110       mpfi        101    5800
## 43      169.1     66.0        ohc        110       2bbl        100    5500
## 44      170.7     61.8        ohc        111       2bbl         78    4800
## 45      155.9     63.6        ohc         90       2bbl         70    5400
## 46      155.9     63.6        ohc         90       2bbl         70    5400
## 47      172.6     65.2        ohc        119       spfi         90    5000
## 48      199.6     69.6       dohc        258       mpfi        176    4750
## 49      199.6     69.6       dohc        258       mpfi        176    4750
## 50      191.7     70.6       ohcv        326       mpfi        262    5000
## 51      159.1     64.2        ohc         91       2bbl         68    5000
## 52      159.1     64.2        ohc         91       2bbl         68    5000
## 53      159.1     64.2        ohc         91       2bbl         68    5000
## 54      166.8     64.2        ohc         91       2bbl         68    5000
## 55      166.8     64.2        ohc         91       2bbl         68    5000
## 56      169.0     65.7      rotor         70       4bbl        101    6000
## 57      169.0     65.7      rotor         70       4bbl        101    6000
## 58      169.0     65.7      rotor         70       4bbl        101    6000
## 59      169.0     65.7      rotor         80       mpfi        135    6000
## 60      177.8     66.5        ohc        122       2bbl         84    4800
## 61      177.8     66.5        ohc        122       2bbl         84    4800
## 62      177.8     66.5        ohc        122       2bbl         84    4800
## 63      177.8     66.5        ohc        122       2bbl         84    4800
## 64      177.8     66.5        ohc        122        idi         64    4650
## 65      177.8     66.5        ohc        122       2bbl         84    4800
## 66      175.0     66.1        ohc        140       mpfi        120    5000
## 67      175.0     66.1        ohc        134        idi         72    4200
## 68      190.9     70.3        ohc        183        idi        123    4350
## 69      190.9     70.3        ohc        183        idi        123    4350
## 70      187.5     70.3        ohc        183        idi        123    4350
## 71      202.6     71.7        ohc        183        idi        123    4350
## 72      202.6     71.7       ohcv        234       mpfi        155    4750
## 73      180.3     70.5       ohcv        234       mpfi        155    4750
## 74      208.1     71.7       ohcv        308       mpfi        184    4500
## 75      199.2     72.0       ohcv        304       mpfi        184    4500
## 76      178.4     68.0        ohc        140       mpfi        175    5000
## 77      157.3     64.4        ohc         92       2bbl         68    5500
## 78      157.3     64.4        ohc         92       2bbl         68    5500
## 79      157.3     64.4        ohc         92       2bbl         68    5500
## 80      157.3     63.8        ohc         98       spdi        102    5500
## 81      173.0     65.4        ohc        110       spdi        116    5500
## 82      173.0     65.4        ohc        122       2bbl         88    5000
## 83      173.2     66.3        ohc        156       spdi        145    5000
## 84      173.2     66.3        ohc        156       spdi        145    5000
## 85      173.2     66.3        ohc        156       spdi        145    5000
## 86      172.4     65.4        ohc        122       2bbl         88    5000
## 87      172.4     65.4        ohc        122       2bbl         88    5000
## 88      172.4     65.4        ohc        110       spdi        116    5500
## 89      172.4     65.4        ohc        110       spdi        116    5500
## 90      165.3     63.8        ohc         97       2bbl         69    5200
## 91      165.3     63.8        ohc        103        idi         55    4800
## 92      165.3     63.8        ohc         97       2bbl         69    5200
## 93      165.3     63.8        ohc         97       2bbl         69    5200
## 94      170.2     63.8        ohc         97       2bbl         69    5200
## 95      165.3     63.8        ohc         97       2bbl         69    5200
## 96      165.6     63.8        ohc         97       2bbl         69    5200
## 97      165.3     63.8        ohc         97       2bbl         69    5200
## 98      170.2     63.8        ohc         97       2bbl         69    5200
## 99      162.4     63.8        ohc         97       2bbl         69    5200
## 100     173.4     65.2        ohc        120       2bbl         97    5200
## 101     173.4     65.2        ohc        120       2bbl         97    5200
## 102     181.7     66.5       ohcv        181       mpfi        152    5200
## 103     184.6     66.5       ohcv        181       mpfi        152    5200
## 104     184.6     66.5       ohcv        181       mpfi        152    5200
## 105     170.7     67.9       ohcv        181       mpfi        160    5200
## 106     170.7     67.9       ohcv        181       mpfi        200    5200
## 107     178.5     67.9       ohcv        181       mpfi        160    5200
## 108     186.7     68.4          l        120       mpfi         97    5000
## 109     186.7     68.4          l        152        idi         95    4150
## 110     198.9     68.4          l        120       mpfi         97    5000
## 111     198.9     68.4          l        152        idi         95    4150
## 112     186.7     68.4          l        120       mpfi         95    5000
## 113     186.7     68.4          l        152        idi         95    4150
## 114     198.9     68.4          l        120       mpfi         95    5000
## 115     198.9     68.4          l        152        idi         95    4150
## 116     186.7     68.4          l        120       mpfi         97    5000
## 117     186.7     68.4          l        152        idi         95    4150
## 118     186.7     68.3          l        134       mpfi        142    5600
## 119     157.3     63.8        ohc         90       2bbl         68    5500
## 120     157.3     63.8        ohc         98       spdi        102    5500
## 121     157.3     63.8        ohc         90       2bbl         68    5500
## 122     167.3     63.8        ohc         90       2bbl         68    5500
## 123     167.3     63.8        ohc         98       2bbl         68    5500
## 124     174.6     64.6        ohc        122       2bbl         88    5000
## 125     173.2     66.3        ohc        156       spdi        145    5000
## 126     168.9     68.3        ohc        151       mpfi        143    5500
## 127     168.9     65.0       ohcf        194       mpfi        207    5900
## 128     168.9     65.0       ohcf        194       mpfi        207    5900
## 129     168.9     65.0       ohcf        194       mpfi        207    5900
## 130     175.7     72.3      dohcv        203       mpfi        288    5750
## 131     181.5     66.5        ohc        132       mpfi         90    5100
## 132     176.8     66.6        ohc        132       mpfi         90    5100
## 133     186.6     66.5        ohc        121       mpfi        110    5250
## 134     186.6     66.5        ohc        121       mpfi        110    5250
## 135     186.6     66.5        ohc        121       mpfi        110    5250
## 136     186.6     66.5        ohc        121       mpfi        110    5250
## 137     186.6     66.5       dohc        121       mpfi        160    5500
## 138     186.6     66.5       dohc        121       mpfi        160    5500
## 139     156.9     63.4       ohcf         97       2bbl         69    4900
## 140     157.9     63.6       ohcf        108       2bbl         73    4400
## 141     157.3     63.8       ohcf        108       2bbl         73    4400
## 142     172.0     65.4       ohcf        108       2bbl         82    4800
## 143     172.0     65.4       ohcf        108       2bbl         82    4400
## 144     172.0     65.4       ohcf        108       mpfi         94    5200
## 145     172.0     65.4       ohcf        108       2bbl         82    4800
## 146     172.0     65.4       ohcf        108       mpfi        111    4800
## 147     173.5     65.4       ohcf        108       2bbl         82    4800
## 148     173.5     65.4       ohcf        108       mpfi         94    5200
## 149     173.6     65.4       ohcf        108       2bbl         82    4800
## 150     173.6     65.4       ohcf        108       mpfi        111    4800
## 151     158.7     63.6        ohc         92       2bbl         62    4800
## 152     158.7     63.6        ohc         92       2bbl         62    4800
## 153     158.7     63.6        ohc         92       2bbl         62    4800
## 154     169.7     63.6        ohc         92       2bbl         62    4800
## 155     169.7     63.6        ohc         92       2bbl         62    4800
## 156     169.7     63.6        ohc         92       2bbl         62    4800
## 157     166.3     64.4        ohc         98       2bbl         70    4800
## 158     166.3     64.4        ohc         98       2bbl         70    4800
## 159     166.3     64.4        ohc        110        idi         56    4500
## 160     166.3     64.4        ohc        110        idi         56    4500
## 161     166.3     64.4        ohc         98       2bbl         70    4800
## 162     166.3     64.4        ohc         98       2bbl         70    4800
## 163     166.3     64.4        ohc         98       2bbl         70    4800
## 164     168.7     64.0        ohc         98       2bbl         70    4800
## 165     168.7     64.0        ohc         98       2bbl         70    4800
## 166     168.7     64.0       dohc         98       mpfi        112    6600
## 167     168.7     64.0       dohc         98       mpfi        112    6600
## 168     176.2     65.6        ohc        146       mpfi        116    4800
## 169     176.2     65.6        ohc        146       mpfi        116    4800
## 170     176.2     65.6        ohc        146       mpfi        116    4800
## 171     176.2     65.6        ohc        146       mpfi        116    4800
## 172     176.2     65.6        ohc        146       mpfi        116    4800
## 173     176.2     65.6        ohc        146       mpfi        116    4800
## 174     175.6     66.5        ohc        122       mpfi         92    4200
## 175     175.6     66.5        ohc        110        idi         73    4500
## 176     175.6     66.5        ohc        122       mpfi         92    4200
## 177     175.6     66.5        ohc        122       mpfi         92    4200
## 178     175.6     66.5        ohc        122       mpfi         92    4200
## 179     183.5     67.7       dohc        171       mpfi        161    5200
## 180     183.5     67.7       dohc        171       mpfi        161    5200
## 181     187.8     66.5       dohc        171       mpfi        156    5200
## 182     187.8     66.5       dohc        161       mpfi        156    5200
## 183     171.7     65.5        ohc         97        idi         52    4800
## 184     171.7     65.5        ohc        109       mpfi         85    5250
## 185     171.7     65.5        ohc         97        idi         52    4800
## 186     171.7     65.5        ohc        109       mpfi         85    5250
## 187     171.7     65.5        ohc        109       mpfi         85    5250
## 188     171.7     65.5        ohc         97        idi         68    4500
## 189     171.7     65.5        ohc        109       mpfi        100    5500
## 190     159.3     64.2        ohc        109       mpfi         90    5500
## 191     165.7     64.0        ohc        109       mpfi         90    5500
## 192     180.2     66.9        ohc        136       mpfi        110    5500
## 193     180.2     66.9        ohc         97        idi         68    4500
## 194     183.1     66.9        ohc        109       mpfi         88    5500
## 195     188.8     67.2        ohc        141       mpfi        114    5400
## 196     188.8     67.2        ohc        141       mpfi        114    5400
## 197     188.8     67.2        ohc        141       mpfi        114    5400
## 198     188.8     67.2        ohc        141       mpfi        114    5400
## 199     188.8     67.2        ohc        130       mpfi        162    5100
## 200     188.8     67.2        ohc        130       mpfi        162    5100
## 201     188.8     68.9        ohc        141       mpfi        114    5400
## 202     188.8     68.8        ohc        141       mpfi        160    5300
## 203     188.8     68.9       ohcv        173       mpfi        134    5500
## 204     188.8     68.9        ohc        145        idi        106    4800
## 205     188.8     68.9        ohc        141       mpfi        114    5400
##        price
## 1   13495.00
## 2   16500.00
## 3   16500.00
## 4   13950.00
## 5   17450.00
## 6   15250.00
## 7   17710.00
## 8   18920.00
## 9   23875.00
## 10  17859.17
## 11  16430.00
## 12  16925.00
## 13  20970.00
## 14  21105.00
## 15  24565.00
## 16  30760.00
## 17  41315.00
## 18  36880.00
## 19   5151.00
## 20   6295.00
## 21   6575.00
## 22   5572.00
## 23   6377.00
## 24   7957.00
## 25   6229.00
## 26   6692.00
## 27   7609.00
## 28   8558.00
## 29   8921.00
## 30  12964.00
## 31   6479.00
## 32   6855.00
## 33   5399.00
## 34   6529.00
## 35   7129.00
## 36   7295.00
## 37   7295.00
## 38   7895.00
## 39   9095.00
## 40   8845.00
## 41  10295.00
## 42  12945.00
## 43  10345.00
## 44   6785.00
## 45   8916.50
## 46   8916.50
## 47  11048.00
## 48  32250.00
## 49  35550.00
## 50  36000.00
## 51   5195.00
## 52   6095.00
## 53   6795.00
## 54   6695.00
## 55   7395.00
## 56  10945.00
## 57  11845.00
## 58  13645.00
## 59  15645.00
## 60   8845.00
## 61   8495.00
## 62  10595.00
## 63  10245.00
## 64  10795.00
## 65  11245.00
## 66  18280.00
## 67  18344.00
## 68  25552.00
## 69  28248.00
## 70  28176.00
## 71  31600.00
## 72  34184.00
## 73  35056.00
## 74  40960.00
## 75  45400.00
## 76  16503.00
## 77   5389.00
## 78   6189.00
## 79   6669.00
## 80   7689.00
## 81   9959.00
## 82   8499.00
## 83  12629.00
## 84  14869.00
## 85  14489.00
## 86   6989.00
## 87   8189.00
## 88   9279.00
## 89   9279.00
## 90   5499.00
## 91   7099.00
## 92   6649.00
## 93   6849.00
## 94   7349.00
## 95   7299.00
## 96   7799.00
## 97   7499.00
## 98   7999.00
## 99   8249.00
## 100  8949.00
## 101  9549.00
## 102 13499.00
## 103 14399.00
## 104 13499.00
## 105 17199.00
## 106 19699.00
## 107 18399.00
## 108 11900.00
## 109 13200.00
## 110 12440.00
## 111 13860.00
## 112 15580.00
## 113 16900.00
## 114 16695.00
## 115 17075.00
## 116 16630.00
## 117 17950.00
## 118 18150.00
## 119  5572.00
## 120  7957.00
## 121  6229.00
## 122  6692.00
## 123  7609.00
## 124  8921.00
## 125 12764.00
## 126 22018.00
## 127 32528.00
## 128 34028.00
## 129 37028.00
## 130 31400.50
## 131  9295.00
## 132  9895.00
## 133 11850.00
## 134 12170.00
## 135 15040.00
## 136 15510.00
## 137 18150.00
## 138 18620.00
## 139  5118.00
## 140  7053.00
## 141  7603.00
## 142  7126.00
## 143  7775.00
## 144  9960.00
## 145  9233.00
## 146 11259.00
## 147  7463.00
## 148 10198.00
## 149  8013.00
## 150 11694.00
## 151  5348.00
## 152  6338.00
## 153  6488.00
## 154  6918.00
## 155  7898.00
## 156  8778.00
## 157  6938.00
## 158  7198.00
## 159  7898.00
## 160  7788.00
## 161  7738.00
## 162  8358.00
## 163  9258.00
## 164  8058.00
## 165  8238.00
## 166  9298.00
## 167  9538.00
## 168  8449.00
## 169  9639.00
## 170  9989.00
## 171 11199.00
## 172 11549.00
## 173 17669.00
## 174  8948.00
## 175 10698.00
## 176  9988.00
## 177 10898.00
## 178 11248.00
## 179 16558.00
## 180 15998.00
## 181 15690.00
## 182 15750.00
## 183  7775.00
## 184  7975.00
## 185  7995.00
## 186  8195.00
## 187  8495.00
## 188  9495.00
## 189  9995.00
## 190 11595.00
## 191  9980.00
## 192 13295.00
## 193 13845.00
## 194 12290.00
## 195 12940.00
## 196 13415.00
## 197 15985.00
## 198 16515.00
## 199 18420.00
## 200 18950.00
## 201 16845.00
## 202 19045.00
## 203 21485.00
## 204 22470.00
## 205 22625.00
par(mfrow = c(2, 2), pty = "m")
doornum <- as.factor(df$doornumber)
engine <- as.factor(df$enginetype)
plot(df$carlength, main = "Plot Car Length", ylab = "Car Length", type = "p", col = "#125B50")
plot(df$carwidth, main = "Plot Car Width", ylab = "Car Width", type = "p", col = "#4D4C7D")
# Factors are plotted via their integer codes; suppress the default numeric
# y-axis (yaxt = "n") and label the ticks with the level names instead.
plot(df$car_ID, as.integer(doornum), main = "Plot Door Number", xlab = "index", ylab = "Door Number", type = "p", col = "#FF6363", yaxt = "n")
axis(2,
     at = seq_along(levels(doornum)),
     labels = levels(doornum))
plot(df$car_ID, as.integer(engine), main = "Plot Engine Type", xlab = "index", ylab = "Engine Type", type = "p", col = "#FFD36E", yaxt = "n")
axis(2,
     at = seq_along(levels(engine)),
     labels = levels(engine))

The Plot Car Length panel shows car lengths ranging from 141.1 to 208.1, with most values between 155 and 180. The Plot Car Width panel shows car widths ranging from 60.3 to 72.3, with most values between 64 and 68. The Plot Door Number panel shows that the number of doors takes only two values, two and four. The Plot Engine Type panel shows the engine types dohc, dohcv, l, ohc, ohcf, ohcv, and rotor, with ohc the most common.
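The task statement asks for the x-axis label "Record number in dataset", the attribute name on the y-axis, and the attribute name as the title. A minimal sketch of that layout, assuming a small made-up data frame `toy` in place of the CarPrice file (the helper `plot_by_record` is hypothetical and not part of the original answer):

```r
# Hypothetical stand-in for the CarPrice data; values are illustrative only.
toy <- data.frame(
  carlength  = c(168.8, 171.2, 176.6, 157.3),
  carwidth   = c(64.1, 65.5, 66.2, 63.8),
  doornumber = c("two", "four", "four", "two"),
  enginetype = c("dohc", "ohcv", "ohc", "ohc")
)

# Plot one attribute against its record number. Categorical attributes are
# drawn via their integer codes, with the level names as y-axis tick labels.
plot_by_record <- function(data, attr) {
  values <- data[[attr]]
  if (is.numeric(values)) {
    plot(values, main = attr, xlab = "Record number in dataset", ylab = attr)
  } else {
    f <- factor(values)
    plot(as.integer(f), main = attr, xlab = "Record number in dataset",
         ylab = attr, yaxt = "n")
    axis(2, at = seq_along(levels(f)), labels = levels(f))
  }
  invisible(attr)
}

old_par <- par(mfrow = c(2, 2))  # 2 x 2 array, filled row by row
panels <- vapply(c("carlength", "carwidth", "doornumber", "enginetype"),
                 function(a) plot_by_record(toy, a), character(1))
par(old_par)                     # restore the previous layout
```

Because `mfrow` fills the array row by row, the call order gives carlength top left, carwidth top right, doornumber lower left, and enginetype lower right.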
  1. Construct a mosaic plot showing the relationship between the variables FuelSystem and DriveWheel in the CarPrice data frame. Does this plot suggest a relationship between these variables? Explain your answer.
#write your code here
fuel <- as.factor(df$fuelsystem)
wheel <- as.factor(df$drivewheel)
mosaic_df <- table(fuel, wheel)
mosaicplot(mosaic_df, main = "Car Price Mosaic Plot", sub = "Relationship Between Drive Wheel and Fuel System", shade = TRUE, xlab = "Fuel System", ylab = "Drive Wheel", border = "#A97155")

The drive wheel is the set of wheels that drives the car, and the fuel system supplies high-pressure fuel to the cylinders to produce energy.
For fuel system 1bbl, all cars use fwd.
For fuel system 2bbl, the majority use fwd, about 10% use 4wd, and about 5% use rwd.
For fuel system 4bbl, all cars use fwd.
For fuel system idi, 50% use rwd and 50% use fwd.
For fuel system mfi, all cars use fwd.
For fuel system mpfi, about 50% use rwd, about 45% use fwd, and about 5% use 4wd.
For fuel system spdi, about 15% use rwd and about 85% use fwd.
For fuel system spfi, all cars use rwd.
Because the drive-wheel mix differs clearly from one fuel system to another, the plot suggests that the two variables are related.
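The percentages above are read off the mosaic plot by eye. A rough sketch of how they could be computed instead, using a made-up table rather than the real counts: `prop.table` gives the drive-wheel share within each fuel system, and Fisher's exact test is one common check of whether two categorical variables are associated.

```r
# Made-up records for illustration; these are not the CarPrice counts.
toy <- data.frame(
  fuelsystem = c("2bbl", "2bbl", "2bbl", "2bbl", "mpfi", "mpfi", "mpfi",
                 "mpfi", "idi", "idi"),
  drivewheel = c("fwd", "fwd", "fwd", "4wd", "rwd", "rwd", "rwd",
                 "fwd", "rwd", "fwd")
)

counts <- table(fuel = toy$fuelsystem, wheel = toy$drivewheel)

# Row-wise proportions: share of each drive wheel within a fuel system.
shares <- prop.table(counts, margin = 1)
print(round(shares, 2))

# Fisher's exact test of independence (suitable for small cell counts).
test <- fisher.test(counts)
print(test$p.value)
```

On the full fuel system by drive wheel table, the exact test may need `simulate.p.value = TRUE`; that is an assumption about the real counts, which are not reproduced here.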
  1. Compute the correlation for all attributes. Interpret the statistical findings!
#write your code here
numeric_df <- data.frame(df$symboling, as.numeric(as.factor(df$doornumber)), as.numeric(as.factor(df$drivewheel)), df$carlength, df$carwidth, as.numeric(as.factor(df$enginetype)), df$enginesize, as.numeric(as.factor(df$fuelsystem)), df$horsepower, df$peakrpm, df$price)
colnames(numeric_df) <- c( 'symboling',  'doornumber', 'drivewheel', 'carlength', 'carwidth', 'enginetype', 'enginesize', 'fuelsystem', 'horsepower', 'peakrpm', 'price')


tableCorr <- rcorr(as.matrix(numeric_df), type = "pearson")
tableCorr
##            symboling doornumber drivewheel carlength carwidth enginetype
## symboling       1.00       0.66      -0.04     -0.36    -0.23       0.05
## doornumber      0.66       1.00       0.10     -0.40    -0.21       0.06
## drivewheel     -0.04       0.10       1.00      0.49     0.47      -0.12
## carlength      -0.36      -0.40       0.49      1.00     0.84      -0.11
## carwidth       -0.23      -0.21       0.47      0.84     1.00       0.01
## enginetype      0.05       0.06      -0.12     -0.11     0.01       1.00
## enginesize     -0.11      -0.02       0.52      0.68     0.74       0.04
## fuelsystem      0.09       0.02       0.42      0.56     0.52      -0.09
## horsepower      0.07       0.13       0.52      0.55     0.64       0.01
## peakrpm         0.27       0.25      -0.04     -0.29    -0.22       0.01
## price          -0.08      -0.03       0.58      0.68     0.76       0.05
##            enginesize fuelsystem horsepower peakrpm price
## symboling       -0.11       0.09       0.07    0.27 -0.08
## doornumber      -0.02       0.02       0.13    0.25 -0.03
## drivewheel       0.52       0.42       0.52   -0.04  0.58
## carlength        0.68       0.56       0.55   -0.29  0.68
## carwidth         0.74       0.52       0.64   -0.22  0.76
## enginetype       0.04      -0.09       0.01    0.01  0.05
## enginesize       1.00       0.51       0.81   -0.24  0.87
## fuelsystem       0.51       1.00       0.66    0.01  0.53
## horsepower       0.81       0.66       1.00    0.13  0.81
## peakrpm         -0.24       0.01       0.13    1.00 -0.09
## price            0.87       0.53       0.81   -0.09  1.00
## 
## n= 205 
## 
## 
## P
##            symboling doornumber drivewheel carlength carwidth enginetype
## symboling            0.0000     0.5530     0.0000    0.0008   0.4732    
## doornumber 0.0000               0.1581     0.0000    0.0029   0.3739    
## drivewheel 0.5530    0.1581                0.0000    0.0000   0.0953    
## carlength  0.0000    0.0000     0.0000               0.0000   0.1058    
## carwidth   0.0008    0.0029     0.0000     0.0000             0.8611    
## enginetype 0.4732    0.3739     0.0953     0.1058    0.8611             
## enginesize 0.1311    0.7678     0.0000     0.0000    0.0000   0.5617    
## fuelsystem 0.1936    0.8252     0.0000     0.0000    0.0000   0.1906    
## horsepower 0.3126    0.0697     0.0000     0.0000    0.0000   0.8835    
## peakrpm    0.0000    0.0003     0.5747     0.0000    0.0015   0.9365    
## price      0.2543    0.6504     0.0000     0.0000    0.0000   0.4838    
##            enginesize fuelsystem horsepower peakrpm price 
## symboling  0.1311     0.1936     0.3126     0.0000  0.2543
## doornumber 0.7678     0.8252     0.0697     0.0003  0.6504
## drivewheel 0.0000     0.0000     0.0000     0.5747  0.0000
## carlength  0.0000     0.0000     0.0000     0.0000  0.0000
## carwidth   0.0000     0.0000     0.0000     0.0015  0.0000
## enginetype 0.5617     0.1906     0.8835     0.9365  0.4838
## enginesize            0.0000     0.0000     0.0004  0.0000
## fuelsystem 0.0000                0.0000     0.8392  0.0000
## horsepower 0.0000     0.0000                0.0610  0.0000
## peakrpm    0.0004     0.8392     0.0610             0.2241
## price      0.0000     0.0000     0.0000     0.2241
Based on the correlation table above:
Price has a high correlation with enginesize (0.87) and horsepower (0.81).
Horsepower has its highest correlations with enginesize and price (both 0.81).
Enginesize has a high correlation with price (0.87), horsepower (0.81), and carwidth (0.74).
Carwidth has a high correlation with carlength (0.84) and price (0.76).
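To rank predictors by their correlation with price without reading the full matrix, one option is to pull out the `price` column and sort it. A sketch with made-up numbers (the data frame `toy` is hypothetical). It also restricts the calculation to columns that are numeric to begin with, since Pearson correlations on integer-coded categories such as drivewheel depend on the arbitrary order of the factor levels.

```r
# Made-up values for illustration only.
toy <- data.frame(
  enginesize = c(97, 109, 130, 152, 183, 209),
  horsepower = c(69, 85, 111, 121, 155, 182),
  peakrpm    = c(5200, 5250, 5000, 4800, 5500, 5400),
  price      = c(7000, 8500, 13500, 16500, 25500, 31000),
  drivewheel = c("fwd", "fwd", "rwd", "rwd", "rwd", "4wd")
)

# Keep only the columns that are numeric in the raw data.
num_cols <- toy[, sapply(toy, is.numeric)]
corr <- cor(num_cols, method = "pearson")

# Correlation of every numeric predictor with price, strongest first.
with_price <- corr[rownames(corr) != "price", "price"]
ranked <- with_price[order(abs(with_price), decreasing = TRUE)]
print(round(ranked, 2))
```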
2. You need to compare three ways (three-sigma edit rule, Hampel identifier, boxplot rule) of detecting univariate outliers for the peakrpm attribute from the data frame: (20 pts.)
  1. Generate a plot for each technique and give the appropriate features (labels, line type, etc.). Based on these plots, which outlier detector seems to be giving the more reasonable results?
#write your code here
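# boxplot.stats() returns the five-number summary without drawing a plot;
# stats[1] and stats[5] are the lower and upper whisker ends.
bp_stats <- boxplot.stats(df$peakrpm)$stats
lower_bound <- bp_stats[1]
upper_bound <- bp_stats[5]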

threeSigmaRule <- function(x,t=3){
  lb = mean(x) - t * sd(x)
  ub = mean(x) + t * sd(x)
  
  return(c(lb, ub))
}

hampelIdentifier <- function(x, t=3){
  lb = median(x) - t * mad(x)
  ub = median(x) + t * mad(x)
  
  return(c(lb,ub))
}

threesigma <- threeSigmaRule(df$peakrpm)
hampel <- hampelIdentifier(df$peakrpm)

lower_bound
## [1] 4150
upper_bound
## [1] 6000
threesigma[1]
## [1] 3694.165
threesigma[2]
## [1] 6556.079
hampel[1]
## [1] 3865.66
hampel[2]
## [1] 6534.34
plot(df$car_ID, df$peakrpm, 
     main = "BoxPlot", 
     type = "p", 
     col = "#FD5D5D", 
     col.main = "#FFBBBB",
     col.lab = "#890F0D",
     fg = "#F68989",
     xlab = "Index",
     ylab = "peakrpm")
abline(h = lower_bound, col = "#D82148", lty = 3)
abline(h = upper_bound, col = "#2D31FA", lty = 3)
legend("topright",lty=c(3,3),
       legend = c("min","max"),
       col = c("#D82148", "#2D31FA"),
       lwd=3,
       title="Labels", 
       title.col = "#614124",
       text.font = 3,
       text.col = c("#D82148", "#2D31FA"),
       bg = "#F7E2E2",
       box.col = "#FFBED8")

plot(df$car_ID, df$peakrpm,
     main = "Three Sigma Rule", 
     type = "p", 
     col = "#333C83", 
     col.main = "#8FBDD3",
     col.lab = "#22577E",
     fg = "#22577E",
     xlab = "Index",
     ylab = "peakrpm")
abline(h = threesigma[1], col = "#D82148", lty = 10)
abline(h = threesigma[2], col = "#2D31FA", lty = 10)
legend("topright",lty=10,
       legend = "max",
       col = "#2D31FA",
       lwd=3,
       title="Labels",
       title.col = "#614124",
       text.font = 3,
       text.col = "#2D31FA",
       bg = "#FFF6EA",
       box.col = "#A97155")

plot(df$car_ID, df$peakrpm,
     main = "Hampel Identifier", 
     type = "p", 
     col = "#247881", 
     col.main = "#99FFCD",
     col.lab = "#006778",
     fg = "#006778",
     xlab = "Index",
     ylab = "peakrpm")
abline(h = hampel[1], col = "#D82148", lty = 4)
abline(h = hampel[2], col = "#2D31FA", lty = 4)
legend("topright",lty=4,
       legend = "max",
       col = "#2D31FA",
       lwd=3,
       title="Labels", 
       title.col = "#614124",
       text.font = 3,
       text.col = "#2D31FA",
       bg = "#FFF6EA",
       box.col = "#A97155")

The boxplot rule gives outlier bounds of 4150 (lower) and 6000 (upper).
The three-sigma edit rule gives bounds of 3694.165 (lower) and 6556.079 (upper); its lower bound falls below the plotted range and is therefore not visible.
The Hampel identifier gives bounds of 3865.66 (lower) and 6534.34 (upper); its lower bound also falls below the plotted range and is not visible.
Based on my assumption that a sedan's peak rpm ideally lies between about 4000 and 6000, the boxplot rule gives the most reasonable lower and upper outlier bounds in these plots.
  1. How many data points are declared outliers by each of the technique? Based on this data points, which outlier detector seems to be giving the more reasonable results?
#write your code here
filter(df, df$peakrpm<lower_bound | df$peakrpm>upper_bound)
##   car_ID symboling                   CarName doornumber drivewheel carlength
## 1    166         1 toyota celica gt liftback        two        rwd     168.7
## 2    167         1     toyota corolla tercel        two        rwd     168.7
##   carwidth enginetype enginesize fuelsystem horsepower peakrpm price
## 1       64       dohc         98       mpfi        112    6600  9298
## 2       64       dohc         98       mpfi        112    6600  9538
filter(df, df$peakrpm<threesigma[1] | df$peakrpm>threesigma[2])
##   car_ID symboling                   CarName doornumber drivewheel carlength
## 1    166         1 toyota celica gt liftback        two        rwd     168.7
## 2    167         1     toyota corolla tercel        two        rwd     168.7
##   carwidth enginetype enginesize fuelsystem horsepower peakrpm price
## 1       64       dohc         98       mpfi        112    6600  9298
## 2       64       dohc         98       mpfi        112    6600  9538
filter(df, df$peakrpm<hampel[1] | df$peakrpm>hampel[2])
##   car_ID symboling                   CarName doornumber drivewheel carlength
## 1    166         1 toyota celica gt liftback        two        rwd     168.7
## 2    167         1     toyota corolla tercel        two        rwd     168.7
##   carwidth enginetype enginesize fuelsystem horsepower peakrpm price
## 1       64       dohc         98       mpfi        112    6600  9298
## 2       64       dohc         98       mpfi        112    6600  9538
Based on the data points, all three techniques flag the same 2 outliers, both with a peakrpm of 6600. The boxplot rule, the three-sigma edit rule, and the Hampel identifier therefore agree on this attribute, so in my judgment all three give reasonable results.
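Counting the flagged points directly avoids reading the counts off printed rows. A minimal sketch on a made-up vector (the helper `count_outside` is hypothetical); the boxplot bounds are the whisker ends from `boxplot.stats`, the same definition as the bounds used above.

```r
# Made-up peak-rpm values with one extreme reading.
x <- c(4800, 5000, 5000, 5200, 5200, 5400, 5500, 5500, 5800, 9000)

# Lower and upper bound for each rule, with threshold t = 3 where applicable.
bounds <- list(
  three_sigma = mean(x) + c(-3, 3) * sd(x),
  hampel      = median(x) + c(-3, 3) * mad(x),
  boxplot     = boxplot.stats(x)$stats[c(1, 5)]
)

# Number of values that fall outside a pair of bounds.
count_outside <- function(x, b) sum(x < b[1] | x > b[2])
n_flagged <- sapply(bounds, count_outside, x = x)
print(n_flagged)
```

On this toy vector the extreme reading inflates the standard deviation enough that the three-sigma rule flags nothing, while the Hampel identifier and the boxplot rule both flag it. The rules need not agree in general, even though they do for peakrpm.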
3. Do a comprehensive EDA on your dataset then find the best-fit linear regression model then answer the following questions: (40 pts.)
  1. Interpret the result of your model.
#write your code here
numeric_only <- sapply(df, is.numeric)
numeric_only_df <- df[, numeric_only]

rcorr(as.matrix(numeric_only_df), type = "pearson")
##            car_ID symboling carlength carwidth enginesize horsepower peakrpm
## car_ID       1.00     -0.15      0.17     0.05      -0.03      -0.02   -0.20
## symboling   -0.15      1.00     -0.36    -0.23      -0.11       0.07    0.27
## carlength    0.17     -0.36      1.00     0.84       0.68       0.55   -0.29
## carwidth     0.05     -0.23      0.84     1.00       0.74       0.64   -0.22
## enginesize  -0.03     -0.11      0.68     0.74       1.00       0.81   -0.24
## horsepower  -0.02      0.07      0.55     0.64       0.81       1.00    0.13
## peakrpm     -0.20      0.27     -0.29    -0.22      -0.24       0.13    1.00
## price       -0.11     -0.08      0.68     0.76       0.87       0.81   -0.09
##            price
## car_ID     -0.11
## symboling  -0.08
## carlength   0.68
## carwidth    0.76
## enginesize  0.87
## horsepower  0.81
## peakrpm    -0.09
## price       1.00
## 
## n= 205 
## 
## 
## P
##            car_ID symboling carlength carwidth enginesize horsepower peakrpm
## car_ID            0.0300    0.0144    0.4557   0.6291     0.8309     0.0034 
## symboling  0.0300           0.0000    0.0008   0.1311     0.3126     0.0000 
## carlength  0.0144 0.0000              0.0000   0.0000     0.0000     0.0000 
## carwidth   0.4557 0.0008    0.0000             0.0000     0.0000     0.0015 
## enginesize 0.6291 0.1311    0.0000    0.0000              0.0000     0.0004 
## horsepower 0.8309 0.3126    0.0000    0.0000   0.0000                0.0610 
## peakrpm    0.0034 0.0000    0.0000    0.0015   0.0004     0.0610            
## price      0.1195 0.2543    0.0000    0.0000   0.0000     0.0000     0.2241 
##            price 
## car_ID     0.1195
## symboling  0.2543
## carlength  0.0000
## carwidth   0.0000
## enginesize 0.0000
## horsepower 0.0000
## peakrpm    0.2241
## price
result <- numeric_only_df[c()] # selects zero columns (only the row count is kept), so no histograms are drawn
output <- hist.data.frame(result)
output
## [1] 0
# modelling
input_multi <- numeric_only_df[, c("horsepower", "enginesize", "price")]
multi_model <- lm(log(price)~enginesize+horsepower, data = input_multi)
summary(multi_model)
## 
## Call:
## lm(formula = log(price) ~ enginesize + horsepower, data = input_multi)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -0.89435 -0.16719 -0.03681  0.17873  0.60249 
## 
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)    
## (Intercept) 8.0405210  0.0559679 143.663  < 2e-16 ***
## enginesize  0.0057368  0.0007116   8.062 6.50e-14 ***
## horsepower  0.0056294  0.0007494   7.512 1.84e-12 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.2483 on 202 degrees of freedom
## Multiple R-squared:  0.7594, Adjusted R-squared:  0.757 
## F-statistic: 318.8 on 2 and 202 DF,  p-value: < 2.2e-16
input <- numeric_only_df[, c("price", "horsepower")]
model <- lm(log(price)~horsepower, data = input)
summary(model)
## 
## Call:
## lm(formula = log(price) ~ horsepower, data = input)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -0.93481 -0.18692 -0.06027  0.18024  0.80756 
## 
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)    
## (Intercept) 8.2592216  0.0561426  147.11   <2e-16 ***
## horsepower  0.0105214  0.0005042   20.87   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.2848 on 203 degrees of freedom
## Multiple R-squared:  0.682,  Adjusted R-squared:  0.6804 
## F-statistic: 435.4 on 1 and 203 DF,  p-value: < 2.2e-16
# validation set
set.seed(1)
validx = createDataPartition(df$price, p=0.8, list = FALSE)
valset = df[-validx,]
trainingset = df[validx,]
#write your code here
model <- lm(log(price)~horsepower, data = input) 
# price is log-transformed to improve R-squared and the F-statistic
summary(model)
## 
## Call:
## lm(formula = log(price) ~ horsepower, data = input)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -0.93481 -0.18692 -0.06027  0.18024  0.80756 
## 
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)    
## (Intercept) 8.2592216  0.0561426  147.11   <2e-16 ***
## horsepower  0.0105214  0.0005042   20.87   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.2848 on 203 degrees of freedom
## Multiple R-squared:  0.682,  Adjusted R-squared:  0.6804 
## F-statistic: 435.4 on 1 and 203 DF,  p-value: < 2.2e-16
plot(model, which = 1)

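# Note: predict() returns values on the log(price) scale, so the columns below
# compare raw prices with log-scale predictions.
valset$predicted <- predict(model, valset)
actual_prediction <- data.frame(valset$price, valset$predicted, valset$price - valset$predicted)
names(actual_prediction) <- c("price", "predicted", "residual")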
correlation_accuracy <- cor(actual_prediction)
correlation_accuracy
##               price predicted  residual
## price     1.0000000 0.8150098 1.0000000
## predicted 0.8150098 1.0000000 0.8149944
## residual  1.0000000 0.8149944 1.0000000
head(actual_prediction)
##   price predicted  residual
## 1 13950  9.332405 13940.668
## 2 23875  9.732218 23865.268
## 3  6377  8.974677  6368.025
## 4  8558  9.332405  8548.668
## 5  5399  8.890506  5390.109
## 6  7129  9.058848  7119.941
#RESULT
prediction <- predict(model, valset)
plot(exp(prediction), valset$price, main="actual vs predicted price", xlab="Predicted Price", ylab="Actual Price", col="blue")
abline(a=0, b=1)

par(mfrow=c(2,2))
plot(model)

hist(rstudent(model))

Based on the summary of the model, with log(price) as the dependent variable and horsepower as the independent variable, the significance codes show that the coefficients are highly significant. The residuals appear roughly normally distributed: Q1 and Q3 have nearly the same magnitude, with Q1 on the negative side and Q3 on the positive side, although the magnitudes of Min and Max differ somewhat. Both standard errors are close to zero, and both Pr(>|t|) values are below 0.05. The F-statistic is large, about 435.
In the diagnostic plots, the closer the red line lies to the dashed line, the better the model.
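Comparing the quartiles of the residuals is only a rough check of normality. A hedged sketch of two more direct checks, a Shapiro-Wilk test and a normal Q-Q plot of the residuals, run here on simulated data because the CarPrice file is not reproduced in this section (all values are made up):

```r
set.seed(1)
# Simulated stand-in: log(price) roughly linear in horsepower.
horsepower <- runif(100, 50, 250)
price <- exp(8.2 + 0.01 * horsepower + rnorm(100, sd = 0.25))
fit <- lm(log(price) ~ horsepower)

res <- residuals(fit)
sw <- shapiro.test(res)   # H0: the residuals are normally distributed
print(sw$p.value)

# Points close to the dashed reference line support approximate normality.
qqnorm(res, main = "Normal Q-Q plot of residuals")
qqline(res, lty = 2)
```

A small p-value would argue against normal residuals; a large one only means the test found no evidence against normality.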
  1. Write down the equation of the best fitting line.
# log(price) = 8.2592216 + 0.0105214 * horsepower
log(price) = 8.2592216 + 0.0105214 * horsepower, or equivalently price = exp(8.2592216 + 0.0105214 * horsepower).
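Because the response is log(price), the slope is easier to read after exponentiating: each additional unit of horsepower multiplies the predicted price by exp(0.0105214). A small worked sketch using the coefficients reported above (the horsepower value of 100 is an arbitrary illustration):

```r
b0 <- 8.2592216   # intercept from the fitted model
b1 <- 0.0105214   # horsepower slope from the fitted model

# Percentage change in predicted price per extra unit of horsepower.
pct_per_hp <- (exp(b1) - 1) * 100

# Predicted price on the original scale for a given horsepower.
predict_price <- function(hp) exp(b0 + b1 * hp)

print(round(pct_per_hp, 2))
print(round(predict_price(100)))
```

The slope therefore corresponds to an increase of roughly 1.06% in predicted price per unit of horsepower.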
  1. Is your model good? Why or why not?
# The code is in 3A
Based on the predictions, the model agrees fairly well with the actual prices: the correlation between actual and predicted values is about 0.81 (81%).
The higher the predicted price, the farther the data points lie from the reference line (actual = predicted), which indicates lower accuracy. At low predicted prices the points lie close to the line, so the predictions are more accurate.
So, in my view the model is good, because this correlation is high (about 81%) and many data points lie close to the reference line.
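The correlation table in 3A compares raw prices with predictions on the log scale. One way to measure accuracy on the original price scale is to fit on the training rows only, back-transform the held-out predictions with `exp`, and report an error such as RMSE. A sketch on simulated data, assuming a plain `sample` split in place of `createDataPartition` (all names and values here are illustrative):

```r
set.seed(1)
n <- 200
# Simulated stand-in for the CarPrice data.
sim <- data.frame(horsepower = runif(n, 50, 250))
sim$price <- exp(8.2 + 0.01 * sim$horsepower + rnorm(n, sd = 0.25))

# 80/20 split; the model only ever sees the training rows.
train_idx <- sample(n, size = 0.8 * n)
train <- sim[train_idx, ]
valid <- sim[-train_idx, ]

fit <- lm(log(price) ~ horsepower, data = train)

# Back-transform so predictions and actual prices share the same unit.
valid$predicted <- exp(predict(fit, newdata = valid))
residual <- valid$price - valid$predicted

rmse <- sqrt(mean(residual^2))
r_actual_pred <- cor(valid$price, valid$predicted)
print(c(rmse = rmse, correlation = r_actual_pred))
```

RMSE is expressed in price units, so it can be compared directly with typical car prices, which a correlation coefficient cannot.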
  1. Based on your answer in c, will you deploy the model? Why or why not?
# The code is in 3A
In my opinion, although the model is good, it should not be used on larger datasets, because the points at high predicted prices lie far from the reference line, which means those predictions are less accurate. If the model were applied to data containing many high-priced cars, it would be inaccurate. Therefore, I would not use this model for other cases.