The data set under consideration is the nuclear data from the boot package. As loaded here, it has 12 columns for 32 plants: a row index, X, and the 11 variables defined below.
| Variable | Definition |
|---|---|
| cost | The capital cost of construction in millions of dollars, adjusted to a 1976 base. |
| date | The date on which the construction permit was issued, measured in years since January 1, 1900, to the nearest month. |
| t1 | The time between application for and issue of the construction permit. |
| t2 | The time between issue of operating license and construction permit. |
| cap | The net capacity of the power plant (MWe). |
| pr | A binary variable where 1 indicates the prior existence of an LWR plant at the same site. |
| ne | A binary variable where 1 indicates that the plant was constructed in the north-east region of the U.S.A. |
| ct | A binary variable where 1 indicates the use of a cooling tower in the plant. |
| bw | A binary variable where 1 indicates that the nuclear steam supply system was manufactured by Babcock-Wilcox. |
| cum.n | The cumulative number of power plants constructed by each architect-engineer. |
| pt | A binary variable where 1 indicates those plants with partial turnkey guarantees. |

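The code that created Nuc is not shown in this report. As a minimal sketch, assuming the data come directly from the nuclear data frame shipped with the boot package and that the X column is simply a running row index, a data frame of the same shape could be built like this:

```r
# Assumption: Nuc is derived from boot::nuclear; the original loading step is not shown.
data(nuclear, package = "boot")

# Assumption: X is a plain row index 1..n, as the summary output suggests.
Nuc <- cbind(X = seq_len(nrow(nuclear)), nuclear)

dim(Nuc)    # number of rows and columns
names(Nuc)  # column names, starting with X
```

If the data were instead read from a file, the X column may have come from the row names written into that file.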
summary(Nuc)
## X cost date t1
## Min. : 1.00 Min. :207.5 Min. :67.17 Min. : 7.00
## 1st Qu.: 8.75 1st Qu.:310.3 1st Qu.:67.90 1st Qu.:11.75
## Median :16.50 Median :448.1 Median :68.42 Median :13.00
## Mean :16.50 Mean :461.6 Mean :68.58 Mean :13.75
## 3rd Qu.:24.25 3rd Qu.:612.0 3rd Qu.:68.92 3rd Qu.:15.25
## Max. :32.00 Max. :881.2 Max. :71.08 Max. :22.00
## t2 cap pr ne
## Min. :44.00 Min. : 457.0 Min. :0.0000 Min. :0.00
## 1st Qu.:56.50 1st Qu.: 745.0 1st Qu.:0.0000 1st Qu.:0.00
## Median :62.50 Median : 822.0 Median :0.0000 Median :0.00
## Mean :62.38 Mean : 825.4 Mean :0.3125 Mean :0.25
## 3rd Qu.:70.25 3rd Qu.: 947.2 3rd Qu.:1.0000 3rd Qu.:0.25
## Max. :85.00 Max. :1130.0 Max. :1.0000 Max. :1.00
## ct bw cum.n pt
## Min. :0.0000 Min. :0.0000 Min. : 1.000 Min. :0.0000
## 1st Qu.:0.0000 1st Qu.:0.0000 1st Qu.: 3.000 1st Qu.:0.0000
## Median :0.0000 Median :0.0000 Median : 7.500 Median :0.0000
## Mean :0.4062 Mean :0.1875 Mean : 8.531 Mean :0.1875
## 3rd Qu.:1.0000 3rd Qu.:0.0000 3rd Qu.:12.500 3rd Qu.:0.0000
## Max. :1.0000 Max. :1.0000 Max. :21.000 Max. :1.0000
The mean and median cost of a plant are approximately 461.6 and 448.1 million dollars (1976 base), respectively. For capacity, the mean and median are approximately 825.4 and 822 MWe.
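The mean and median figures quoted above can be reproduced directly. The sketch below is self-contained: the two vectors are transcribed from the cost and cap columns of the full listing of Nuc printed later in this report, and the name `stats` is only illustrative.

```r
# Values transcribed from the cost and cap columns of the full Nuc listing.
cost <- c(460.05, 452.99, 443.22, 652.32, 642.23, 345.39, 272.37, 317.21,
          457.12, 690.19, 350.63, 402.59, 412.18, 495.58, 394.36, 423.32,
          712.27, 289.66, 881.24, 490.88, 567.79, 665.99, 621.45, 608.80,
          473.64, 697.14, 207.51, 288.48, 284.88, 280.36, 217.38, 270.71)
cap <- c(687, 1065, 1065, 1065, 1065, 514, 822, 457,
         822, 792, 560, 790, 530, 1050, 850, 778,
         845, 530, 1090, 1050, 913, 828, 786, 821,
         538, 1130, 745, 821, 886, 886, 745, 886)

# Mean and median of each variable, collected into a 2 x 2 matrix
# (rows: mean, median; columns: cost, cap).
stats <- sapply(list(cost = cost, cap = cap),
                function(x) c(mean = mean(x), median = median(x)))
print(stats)
```

With the full data frame available, the same numbers come from `mean(Nuc$cost)`, `median(Nuc$cost)`, and so on.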
We will create a subset containing only the plants outside the north-east region (rows where ne is 0) and drop all binary indicator variables (columns).
colsub <- c('X', 'cost', 'date', 't1', 't2', 'cap', 'cum.n')
NucSub <- Nuc[Nuc$ne == 0, colsub]
names(NucSub) <- c('IDX', 'CapCost', 'ConstDate', 'App2Issue', 'Op2Issue', 'Capacity', 'NumConstructed')
summary(NucSub)
## IDX CapCost ConstDate App2Issue
## Min. : 2.00 Min. :207.5 Min. :67.17 Min. : 7.00
## 1st Qu.:12.50 1st Qu.:287.6 1st Qu.:67.83 1st Qu.:11.75
## Median :19.50 Median :417.8 Median :68.42 Median :13.00
## Mean :18.79 Mean :425.0 Mean :68.53 Mean :13.58
## 3rd Qu.:26.25 3rd Qu.:492.1 3rd Qu.:68.92 3rd Qu.:15.25
## Max. :32.00 Max. :881.2 Max. :71.08 Max. :20.00
## Op2Issue Capacity NumConstructed
## Min. :44.00 Min. : 457.0 Min. : 1.000
## 1st Qu.:57.75 1st Qu.: 745.0 1st Qu.: 2.750
## Median :63.00 Median : 822.0 Median : 7.000
## Mean :63.29 Mean : 826.1 Mean : 7.792
## 3rd Qu.:70.25 3rd Qu.: 947.2 3rd Qu.:11.000
## Max. :85.00 Max. :1130.0 Max. :21.000
The mean and median cost of a plant in this subset are approximately 425.0 and 417.8 million dollars, respectively, and the mean and median capacity are approximately 826.1 and 822 MWe. Compared with the full data set, the mean and median costs are lower, while the mean and median capacity are essentially unchanged.
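The subsetting and renaming steps used for NucSub can be tried on a small example. The sketch below uses the first six rows of Nuc with a reduced set of columns. The names `toy`, `keep`, `by_bracket`, `by_subset`, and `new_names` are illustrative and not part of the original analysis, and the lookup-based rename is a suggested alternative rather than the method used above.

```r
# First six rows of Nuc, with fewer columns to keep the example short.
toy <- data.frame(
  X    = 1:6,
  cost = c(460.05, 452.99, 443.22, 652.32, 642.23, 345.39),
  cap  = c(687, 1065, 1065, 1065, 1065, 514),
  ne   = c(1, 0, 0, 1, 1, 1),
  ct   = c(0, 1, 1, 1, 1, 1)
)

keep <- c("X", "cost", "cap")

# Bracket indexing, as in the text: logical row filter plus column names.
by_bracket <- toy[toy$ne == 0, keep]

# The same selection written with base R's subset().
by_subset <- subset(toy, ne == 0, select = keep)

# Rename through a named lookup so each new name is tied to its old name
# rather than to its position in the column vector.
new_names <- c(X = "IDX", cost = "CapCost", cap = "Capacity")
names(by_bracket) <- unname(new_names[names(by_bracket)])

print(by_bracket)
```

Tying new names to old names guards against a silent mislabel if the column order in the selection vector is ever changed.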
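Since this data set has no character values to rename, the renaming step is carried out as a re-indexing instead: the IDX column is renumbered starting from 1. The subset is shown before and after the change.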
NucSub
## IDX CapCost ConstDate App2Issue Op2Issue Capacity NumConstructed
## 2 2 452.99 67.33 10 73 1065 1
## 3 3 443.22 67.33 10 85 1065 1
## 7 7 272.37 68.17 12 50 822 5
## 8 8 317.21 68.42 14 59 457 1
## 9 9 457.12 68.42 15 55 822 5
## 11 11 350.63 68.58 12 64 560 3
## 13 13 412.18 68.42 15 62 530 2
## 14 14 495.58 68.92 17 52 1050 7
## 15 15 394.36 68.92 13 65 850 16
## 16 16 423.32 68.42 11 67 778 3
## 18 18 289.66 68.42 15 76 530 2
## 19 19 881.24 69.17 15 67 1090 1
## 20 20 490.88 68.92 16 59 1050 8
## 21 21 567.79 68.75 11 70 913 15
## 23 23 621.45 69.67 16 59 786 18
## 24 24 608.80 70.08 19 58 821 3
## 25 25 473.64 70.42 19 44 538 19
## 26 26 697.14 71.08 20 57 1130 21
## 27 27 207.51 67.25 13 63 745 8
## 28 28 288.48 67.17 9 48 821 7
## 29 29 284.88 67.83 12 63 886 11
## 30 30 280.36 67.83 12 71 886 11
## 31 31 217.38 67.25 13 72 745 8
## 32 32 270.71 67.83 7 80 886 11
NucSub$IDX <- seq_along(NucSub$IDX)
NucSub
## IDX CapCost ConstDate App2Issue Op2Issue Capacity NumConstructed
## 2 1 452.99 67.33 10 73 1065 1
## 3 2 443.22 67.33 10 85 1065 1
## 7 3 272.37 68.17 12 50 822 5
## 8 4 317.21 68.42 14 59 457 1
## 9 5 457.12 68.42 15 55 822 5
## 11 6 350.63 68.58 12 64 560 3
## 13 7 412.18 68.42 15 62 530 2
## 14 8 495.58 68.92 17 52 1050 7
## 15 9 394.36 68.92 13 65 850 16
## 16 10 423.32 68.42 11 67 778 3
## 18 11 289.66 68.42 15 76 530 2
## 19 12 881.24 69.17 15 67 1090 1
## 20 13 490.88 68.92 16 59 1050 8
## 21 14 567.79 68.75 11 70 913 15
## 23 15 621.45 69.67 16 59 786 18
## 24 16 608.80 70.08 19 58 821 3
## 25 17 473.64 70.42 19 44 538 19
## 26 18 697.14 71.08 20 57 1130 21
## 27 19 207.51 67.25 13 63 745 8
## 28 20 288.48 67.17 9 48 821 7
## 29 21 284.88 67.83 12 63 886 11
## 30 22 280.36 67.83 12 71 886 11
## 31 23 217.38 67.25 13 72 745 8
## 32 24 270.71 67.83 7 80 886 11
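In the output above, IDX now runs from 1 to 24, but the row names in the left-most position still carry the original values (2, 3, 7, ...). If those should be renumbered as well, one option is to reset the row names. The sketch below illustrates this on four rows of the subset; the data frame name `toy` is illustrative, and resetting the row names is an optional extra step that the original analysis does not perform.

```r
# Four rows of the subset before re-indexing, keeping the original row names.
toy <- data.frame(
  IDX       = c(2, 3, 7, 8),
  CapCost   = c(452.99, 443.22, 272.37, 317.21),
  Capacity  = c(1065, 1065, 822, 457),
  row.names = c("2", "3", "7", "8")
)

# Renumber the IDX column, as done in the text.
toy$IDX <- seq_along(toy$IDX)

# Optional extra step: reset the row names so they match IDX.
rownames(toy) <- NULL

print(toy)
```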
NucSub was already displayed above; for completeness, both the full data set and the subset are printed here:
Nuc
## X cost date t1 t2 cap pr ne ct bw cum.n pt
## 1 1 460.05 68.58 14 46 687 0 1 0 0 14 0
## 2 2 452.99 67.33 10 73 1065 0 0 1 0 1 0
## 3 3 443.22 67.33 10 85 1065 1 0 1 0 1 0
## 4 4 652.32 68.00 11 67 1065 0 1 1 0 12 0
## 5 5 642.23 68.00 11 78 1065 1 1 1 0 12 0
## 6 6 345.39 67.92 13 51 514 0 1 1 0 3 0
## 7 7 272.37 68.17 12 50 822 0 0 0 0 5 0
## 8 8 317.21 68.42 14 59 457 0 0 0 0 1 0
## 9 9 457.12 68.42 15 55 822 1 0 0 0 5 0
## 10 10 690.19 68.33 12 71 792 0 1 1 1 2 0
## 11 11 350.63 68.58 12 64 560 0 0 0 0 3 0
## 12 12 402.59 68.75 13 47 790 0 1 0 0 6 0
## 13 13 412.18 68.42 15 62 530 0 0 1 0 2 0
## 14 14 495.58 68.92 17 52 1050 0 0 0 0 7 0
## 15 15 394.36 68.92 13 65 850 0 0 0 1 16 0
## 16 16 423.32 68.42 11 67 778 0 0 0 0 3 0
## 17 17 712.27 69.50 18 60 845 0 1 0 0 17 0
## 18 18 289.66 68.42 15 76 530 1 0 1 0 2 0
## 19 19 881.24 69.17 15 67 1090 0 0 0 0 1 0
## 20 20 490.88 68.92 16 59 1050 1 0 0 0 8 0
## 21 21 567.79 68.75 11 70 913 0 0 1 1 15 0
## 22 22 665.99 70.92 22 57 828 1 1 0 0 20 0
## 23 23 621.45 69.67 16 59 786 0 0 1 0 18 0
## 24 24 608.80 70.08 19 58 821 1 0 0 0 3 0
## 25 25 473.64 70.42 19 44 538 0 0 1 0 19 0
## 26 26 697.14 71.08 20 57 1130 0 0 1 0 21 0
## 27 27 207.51 67.25 13 63 745 0 0 0 0 8 1
## 28 28 288.48 67.17 9 48 821 0 0 1 0 7 1
## 29 29 284.88 67.83 12 63 886 0 0 0 1 11 1
## 30 30 280.36 67.83 12 71 886 1 0 0 1 11 1
## 31 31 217.38 67.25 13 72 745 1 0 0 0 8 1
## 32 32 270.71 67.83 7 80 886 1 0 0 1 11 1
NucSub
## IDX CapCost ConstDate App2Issue Op2Issue Capacity NumConstructed
## 2 1 452.99 67.33 10 73 1065 1
## 3 2 443.22 67.33 10 85 1065 1
## 7 3 272.37 68.17 12 50 822 5
## 8 4 317.21 68.42 14 59 457 1
## 9 5 457.12 68.42 15 55 822 5
## 11 6 350.63 68.58 12 64 560 3
## 13 7 412.18 68.42 15 62 530 2
## 14 8 495.58 68.92 17 52 1050 7
## 15 9 394.36 68.92 13 65 850 16
## 16 10 423.32 68.42 11 67 778 3
## 18 11 289.66 68.42 15 76 530 2
## 19 12 881.24 69.17 15 67 1090 1
## 20 13 490.88 68.92 16 59 1050 8
## 21 14 567.79 68.75 11 70 913 15
## 23 15 621.45 69.67 16 59 786 18
## 24 16 608.80 70.08 19 58 821 3
## 25 17 473.64 70.42 19 44 538 19
## 26 18 697.14 71.08 20 57 1130 21
## 27 19 207.51 67.25 13 63 745 8
## 28 20 288.48 67.17 9 48 821 7
## 29 21 284.88 67.83 12 63 886 11
## 30 22 280.36 67.83 12 71 886 11
## 31 23 217.38 67.25 13 72 745 8
## 32 24 270.71 67.83 7 80 886 11
sessionInfo()
## R version 3.6.0 (2019-04-26)
## Platform: x86_64-w64-mingw32/x64 (64-bit)
## Running under: Windows 10 x64 (build 17763)
##
## Matrix products: default
##
## locale:
## [1] LC_COLLATE=English_United States.1252
## [2] LC_CTYPE=English_United States.1252
## [3] LC_MONETARY=English_United States.1252
## [4] LC_NUMERIC=C
## [5] LC_TIME=English_United States.1252
##
## attached base packages:
## [1] stats graphics grDevices utils datasets methods base
##
## other attached packages:
## [1] ggplot2_3.2.0 curl_3.3
##
## loaded via a namespace (and not attached):
## [1] Rcpp_1.0.1 knitr_1.23 magrittr_1.5 tidyselect_0.2.5
## [5] munsell_0.5.0 colorspace_1.4-1 R6_2.4.0 rlang_0.4.0
## [9] stringr_1.4.0 dplyr_0.8.3 tools_3.6.0 grid_3.6.0
## [13] gtable_0.3.0 xfun_0.8 withr_2.1.2 htmltools_0.3.6
## [17] assertthat_0.2.1 yaml_2.2.0 lazyeval_0.2.2 digest_0.6.20
## [21] tibble_2.1.3 crayon_1.3.4 purrr_0.3.2 glue_1.3.1
## [25] evaluate_0.14 rmarkdown_1.13 stringi_1.4.3 compiler_3.6.0
## [29] pillar_1.4.2 scales_1.0.0 pkgconfig_2.0.2