Summary

The data set under consideration is the nuclear data from the boot package. It describes 32 plants. The copy used here has 12 columns: the 11 variables defined below plus a row index X.

Variable Definition
cost The capital cost of construction in millions of dollars adjusted to 1976 base.
date The date on which the construction permit was issued. The data are measured in years since January 1 1900 to the nearest month.
t1 The time between application for and issue of the construction permit.
t2 The time between issue of operating license and construction permit.
cap The net capacity of the power plant (MWe).
pr A binary variable where 1 indicates the prior existence of a LWR plant at the same site.
ne A binary variable where 1 indicates that the plant was constructed in the north-east region of the U.S.A.
ct A binary variable where 1 indicates the use of a cooling tower in the plant.
bw A binary variable where 1 indicates that the nuclear steam supply system was manufactured by Babcock-Wilcox.
cum.n The cumulative number of power plants constructed by each architect-engineer.
pt A binary variable where 1 indicates those plants with partial turnkey guarantees.
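
The code that loads Nuc is not shown in this report. The following is a minimal sketch of one way to build an equivalent data frame. It assumes the boot package is installed and that the X column seen in the output below is a plain row index; the report confirms neither, so treat this as illustrative only.

```r
# Assumption: the boot package is installed (it ships with standard R builds).
data(nuclear, package = "boot")

# Assumption: X is simply a 1..n row index placed in front of the 11 variables.
Nuc <- cbind(X = seq_len(nrow(nuclear)), nuclear)

dim(Nuc)    # number of rows and columns
names(Nuc)  # column names, starting with X
```

If Nuc was instead read from a CSV export of the data, the X column would typically come from the row names written to that file.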

Question 1

summary(Nuc)
##        X              cost            date             t1       
##  Min.   : 1.00   Min.   :207.5   Min.   :67.17   Min.   : 7.00  
##  1st Qu.: 8.75   1st Qu.:310.3   1st Qu.:67.90   1st Qu.:11.75  
##  Median :16.50   Median :448.1   Median :68.42   Median :13.00  
##  Mean   :16.50   Mean   :461.6   Mean   :68.58   Mean   :13.75  
##  3rd Qu.:24.25   3rd Qu.:612.0   3rd Qu.:68.92   3rd Qu.:15.25  
##  Max.   :32.00   Max.   :881.2   Max.   :71.08   Max.   :22.00  
##        t2             cap               pr               ne      
##  Min.   :44.00   Min.   : 457.0   Min.   :0.0000   Min.   :0.00  
##  1st Qu.:56.50   1st Qu.: 745.0   1st Qu.:0.0000   1st Qu.:0.00  
##  Median :62.50   Median : 822.0   Median :0.0000   Median :0.00  
##  Mean   :62.38   Mean   : 825.4   Mean   :0.3125   Mean   :0.25  
##  3rd Qu.:70.25   3rd Qu.: 947.2   3rd Qu.:1.0000   3rd Qu.:0.25  
##  Max.   :85.00   Max.   :1130.0   Max.   :1.0000   Max.   :1.00  
##        ct               bw             cum.n              pt        
##  Min.   :0.0000   Min.   :0.0000   Min.   : 1.000   Min.   :0.0000  
##  1st Qu.:0.0000   1st Qu.:0.0000   1st Qu.: 3.000   1st Qu.:0.0000  
##  Median :0.0000   Median :0.0000   Median : 7.500   Median :0.0000  
##  Mean   :0.4062   Mean   :0.1875   Mean   : 8.531   Mean   :0.1875  
##  3rd Qu.:1.0000   3rd Qu.:0.0000   3rd Qu.:12.500   3rd Qu.:0.0000  
##  Max.   :1.0000   Max.   :1.0000   Max.   :21.000   Max.   :1.0000

The mean and median cost of a plant are 461.5603125 and 448.105 million dollars (1976 base), respectively. For capacity, the mean and median are 825.375 and 822 MWe.
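
The unrounded figures can also be computed directly rather than read off the summary table. This is a small sketch that assumes the nuclear data frame from the boot package holds the same values as Nuc (minus the X column); the name stats is only illustrative.

```r
# Assumption: boot::nuclear matches the Nuc data frame used in this report.
data(nuclear, package = "boot")

# Mean and median of cost (millions of 1976 dollars) and capacity (MWe)
stats <- sapply(nuclear[c("cost", "cap")],
                function(x) c(mean = mean(x), median = median(x)))
stats
```

The resulting matrix has one column per variable and one row per statistic, and should agree with the values quoted above.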

Question 2

We create a subset containing only the plants outside the north-east region (row filter, ne == 0) and drop all binary indicator variables (column filter).

colsub <- c('X', 'cost', 'date', 't1', 't2', 'cap', 'cum.n')
NucSub <- Nuc[Nuc$ne == 0, colsub]

Question 3

names(NucSub) <- c('IDX', 'CapCost', 'ConstDate', 'App2Issue', 'Op2Issue', 'Capacity', 'NumConstructed')
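
Assigning names by position works here because the column order is known. A slightly more defensive variant maps old names to new ones explicitly, so a reordered data frame would still be renamed correctly. The sketch below uses a made-up two-row stand-in (df) and a hypothetical lookup vector (new_names), neither of which is part of the original analysis.

```r
# Toy stand-in for NucSub with the original column names (values are made up)
df <- data.frame(X = 1:2, cost = c(1, 2), date = c(67, 68), t1 = c(10, 12),
                 t2 = c(50, 60), cap = c(800, 900), cum.n = c(1, 5))

# Hypothetical lookup: old name -> new name
new_names <- c(X = "IDX", cost = "CapCost", date = "ConstDate",
               t1 = "App2Issue", t2 = "Op2Issue", cap = "Capacity",
               cum.n = "NumConstructed")

# Stop early if any column has no entry in the lookup
stopifnot(all(names(df) %in% names(new_names)))

# Rename by looking up each existing column name
names(df) <- unname(new_names[names(df)])
names(df)
```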

Question 4

summary(NucSub)
##       IDX           CapCost        ConstDate       App2Issue    
##  Min.   : 2.00   Min.   :207.5   Min.   :67.17   Min.   : 7.00  
##  1st Qu.:12.50   1st Qu.:287.6   1st Qu.:67.83   1st Qu.:11.75  
##  Median :19.50   Median :417.8   Median :68.42   Median :13.00  
##  Mean   :18.79   Mean   :425.0   Mean   :68.53   Mean   :13.58  
##  3rd Qu.:26.25   3rd Qu.:492.1   3rd Qu.:68.92   3rd Qu.:15.25  
##  Max.   :32.00   Max.   :881.2   Max.   :71.08   Max.   :20.00  
##     Op2Issue        Capacity      NumConstructed  
##  Min.   :44.00   Min.   : 457.0   Min.   : 1.000  
##  1st Qu.:57.75   1st Qu.: 745.0   1st Qu.: 2.750  
##  Median :63.00   Median : 822.0   Median : 7.000  
##  Mean   :63.29   Mean   : 826.1   Mean   : 7.792  
##  3rd Qu.:70.25   3rd Qu.: 947.2   3rd Qu.:11.000  
##  Max.   :85.00   Max.   :1130.0   Max.   :21.000

The mean and median cost of a plant in this subset are 424.9541667 and 417.75 million dollars, respectively. For capacity, the new mean and median are 826.0833333 and 822 MWe. Compared with the full data set, the mean cost drops from 461.56 to 424.95 and the median cost from 448.11 to 417.75, while the mean capacity barely moves (825.38 to 826.08) and the median capacity is unchanged at 822.
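
The same comparison can be made in one step by splitting on the ne indicator, which also shows the figures for the north-east plants that were filtered out. This sketch again assumes boot::nuclear matches Nuc; by_region is an illustrative name.

```r
# Assumption: boot::nuclear matches the Nuc data frame used in this report.
data(nuclear, package = "boot")

# Mean and median of cost and capacity for each value of ne (0 = not north-east)
by_region <- aggregate(cbind(cost, cap) ~ ne, data = nuclear,
                       FUN = function(x) c(mean = mean(x), median = median(x)))
by_region
```

The ne == 0 row should reproduce the subset figures given above.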

Question 5

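Since this data set has no character values, the renaming is done by re-indexing the IDX column so that it starts from 1. The data frame is printed before and after the change. Only the IDX column is modified; the row labels in the printout keep their original values.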

NucSub
##    IDX CapCost ConstDate App2Issue Op2Issue Capacity NumConstructed
## 2    2  452.99     67.33        10       73     1065              1
## 3    3  443.22     67.33        10       85     1065              1
## 7    7  272.37     68.17        12       50      822              5
## 8    8  317.21     68.42        14       59      457              1
## 9    9  457.12     68.42        15       55      822              5
## 11  11  350.63     68.58        12       64      560              3
## 13  13  412.18     68.42        15       62      530              2
## 14  14  495.58     68.92        17       52     1050              7
## 15  15  394.36     68.92        13       65      850             16
## 16  16  423.32     68.42        11       67      778              3
## 18  18  289.66     68.42        15       76      530              2
## 19  19  881.24     69.17        15       67     1090              1
## 20  20  490.88     68.92        16       59     1050              8
## 21  21  567.79     68.75        11       70      913             15
## 23  23  621.45     69.67        16       59      786             18
## 24  24  608.80     70.08        19       58      821              3
## 25  25  473.64     70.42        19       44      538             19
## 26  26  697.14     71.08        20       57     1130             21
## 27  27  207.51     67.25        13       63      745              8
## 28  28  288.48     67.17         9       48      821              7
## 29  29  284.88     67.83        12       63      886             11
## 30  30  280.36     67.83        12       71      886             11
## 31  31  217.38     67.25        13       72      745              8
## 32  32  270.71     67.83         7       80      886             11
NucSub$IDX <- seq_along(NucSub$IDX)
NucSub
##    IDX CapCost ConstDate App2Issue Op2Issue Capacity NumConstructed
## 2    1  452.99     67.33        10       73     1065              1
## 3    2  443.22     67.33        10       85     1065              1
## 7    3  272.37     68.17        12       50      822              5
## 8    4  317.21     68.42        14       59      457              1
## 9    5  457.12     68.42        15       55      822              5
## 11   6  350.63     68.58        12       64      560              3
## 13   7  412.18     68.42        15       62      530              2
## 14   8  495.58     68.92        17       52     1050              7
## 15   9  394.36     68.92        13       65      850             16
## 16  10  423.32     68.42        11       67      778              3
## 18  11  289.66     68.42        15       76      530              2
## 19  12  881.24     69.17        15       67     1090              1
## 20  13  490.88     68.92        16       59     1050              8
## 21  14  567.79     68.75        11       70      913             15
## 23  15  621.45     69.67        16       59      786             18
## 24  16  608.80     70.08        19       58      821              3
## 25  17  473.64     70.42        19       44      538             19
## 26  18  697.14     71.08        20       57     1130             21
## 27  19  207.51     67.25        13       63      745              8
## 28  20  288.48     67.17         9       48      821              7
## 29  21  284.88     67.83        12       63      886             11
## 30  22  280.36     67.83        12       71      886             11
## 31  23  217.38     67.25        13       72      745              8
## 32  24  270.71     67.83         7       80      886             11
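
The printout above shows that re-indexing IDX leaves the row labels (2, 3, 7, ...) untouched. If the row labels should restart at 1 as well, one option is to reset them with rownames(). This is a sketch on a three-row stand-in (df) built from the first rows shown above, not a step the original analysis performs.

```r
# Three-row stand-in using the first rows of NucSub, with the original labels
df <- data.frame(IDX = c(2, 3, 7),
                 CapCost = c(452.99, 443.22, 272.37),
                 row.names = c("2", "3", "7"))

# Re-index the IDX column, as in the report
df$IDX <- seq_len(nrow(df))

# Additionally reset the row labels to the default 1..n sequence
rownames(df) <- NULL
df
```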

Question 6

NucSub was already printed above; both data frames are displayed here for completeness:

Nuc
##     X   cost  date t1 t2  cap pr ne ct bw cum.n pt
## 1   1 460.05 68.58 14 46  687  0  1  0  0    14  0
## 2   2 452.99 67.33 10 73 1065  0  0  1  0     1  0
## 3   3 443.22 67.33 10 85 1065  1  0  1  0     1  0
## 4   4 652.32 68.00 11 67 1065  0  1  1  0    12  0
## 5   5 642.23 68.00 11 78 1065  1  1  1  0    12  0
## 6   6 345.39 67.92 13 51  514  0  1  1  0     3  0
## 7   7 272.37 68.17 12 50  822  0  0  0  0     5  0
## 8   8 317.21 68.42 14 59  457  0  0  0  0     1  0
## 9   9 457.12 68.42 15 55  822  1  0  0  0     5  0
## 10 10 690.19 68.33 12 71  792  0  1  1  1     2  0
## 11 11 350.63 68.58 12 64  560  0  0  0  0     3  0
## 12 12 402.59 68.75 13 47  790  0  1  0  0     6  0
## 13 13 412.18 68.42 15 62  530  0  0  1  0     2  0
## 14 14 495.58 68.92 17 52 1050  0  0  0  0     7  0
## 15 15 394.36 68.92 13 65  850  0  0  0  1    16  0
## 16 16 423.32 68.42 11 67  778  0  0  0  0     3  0
## 17 17 712.27 69.50 18 60  845  0  1  0  0    17  0
## 18 18 289.66 68.42 15 76  530  1  0  1  0     2  0
## 19 19 881.24 69.17 15 67 1090  0  0  0  0     1  0
## 20 20 490.88 68.92 16 59 1050  1  0  0  0     8  0
## 21 21 567.79 68.75 11 70  913  0  0  1  1    15  0
## 22 22 665.99 70.92 22 57  828  1  1  0  0    20  0
## 23 23 621.45 69.67 16 59  786  0  0  1  0    18  0
## 24 24 608.80 70.08 19 58  821  1  0  0  0     3  0
## 25 25 473.64 70.42 19 44  538  0  0  1  0    19  0
## 26 26 697.14 71.08 20 57 1130  0  0  1  0    21  0
## 27 27 207.51 67.25 13 63  745  0  0  0  0     8  1
## 28 28 288.48 67.17  9 48  821  0  0  1  0     7  1
## 29 29 284.88 67.83 12 63  886  0  0  0  1    11  1
## 30 30 280.36 67.83 12 71  886  1  0  0  1    11  1
## 31 31 217.38 67.25 13 72  745  1  0  0  0     8  1
## 32 32 270.71 67.83  7 80  886  1  0  0  1    11  1
NucSub
##    IDX CapCost ConstDate App2Issue Op2Issue Capacity NumConstructed
## 2    1  452.99     67.33        10       73     1065              1
## 3    2  443.22     67.33        10       85     1065              1
## 7    3  272.37     68.17        12       50      822              5
## 8    4  317.21     68.42        14       59      457              1
## 9    5  457.12     68.42        15       55      822              5
## 11   6  350.63     68.58        12       64      560              3
## 13   7  412.18     68.42        15       62      530              2
## 14   8  495.58     68.92        17       52     1050              7
## 15   9  394.36     68.92        13       65      850             16
## 16  10  423.32     68.42        11       67      778              3
## 18  11  289.66     68.42        15       76      530              2
## 19  12  881.24     69.17        15       67     1090              1
## 20  13  490.88     68.92        16       59     1050              8
## 21  14  567.79     68.75        11       70      913             15
## 23  15  621.45     69.67        16       59      786             18
## 24  16  608.80     70.08        19       58      821              3
## 25  17  473.64     70.42        19       44      538             19
## 26  18  697.14     71.08        20       57     1130             21
## 27  19  207.51     67.25        13       63      745              8
## 28  20  288.48     67.17         9       48      821              7
## 29  21  284.88     67.83        12       63      886             11
## 30  22  280.36     67.83        12       71      886             11
## 31  23  217.38     67.25        13       72      745              8
## 32  24  270.71     67.83         7       80      886             11

SessionInfo

sessionInfo()
## R version 3.6.0 (2019-04-26)
## Platform: x86_64-w64-mingw32/x64 (64-bit)
## Running under: Windows 10 x64 (build 17763)
## 
## Matrix products: default
## 
## locale:
## [1] LC_COLLATE=English_United States.1252 
## [2] LC_CTYPE=English_United States.1252   
## [3] LC_MONETARY=English_United States.1252
## [4] LC_NUMERIC=C                          
## [5] LC_TIME=English_United States.1252    
## 
## attached base packages:
## [1] stats     graphics  grDevices utils     datasets  methods   base     
## 
## other attached packages:
## [1] ggplot2_3.2.0 curl_3.3     
## 
## loaded via a namespace (and not attached):
##  [1] Rcpp_1.0.1       knitr_1.23       magrittr_1.5     tidyselect_0.2.5
##  [5] munsell_0.5.0    colorspace_1.4-1 R6_2.4.0         rlang_0.4.0     
##  [9] stringr_1.4.0    dplyr_0.8.3      tools_3.6.0      grid_3.6.0      
## [13] gtable_0.3.0     xfun_0.8         withr_2.1.2      htmltools_0.3.6 
## [17] assertthat_0.2.1 yaml_2.2.0       lazyeval_0.2.2   digest_0.6.20   
## [21] tibble_2.1.3     crayon_1.3.4     purrr_0.3.2      glue_1.3.1      
## [25] evaluate_0.14    rmarkdown_1.13   stringi_1.4.3    compiler_3.6.0  
## [29] pillar_1.4.2     scales_1.0.0     pkgconfig_2.0.2