1 Summary

1.1 Neural analysis

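R offers at least three neural network packages; neuralnet, nnet, and rnns are the representative ones.

This document uses the neuralnet package for a regression problem and the nnet package for a classification problem.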

1.2 Data

1.2.1 Data_1

[A well-known data-sharing site] https://archive.ics.uci.edu/ml/index.php

Used data: Combined Cycle Power Plant

Data Set Information:

The dataset contains 9568 data points collected from a Combined Cycle Power Plant over 6 years (2006-2011), when the power plant was set to work with full load. Features consist of hourly average ambient variables Temperature (T), Ambient Pressure (AP), Relative Humidity (RH) and Exhaust Vacuum (V) to predict the net hourly electrical energy output (EP) of the plant. A combined cycle power plant (CCPP) is composed of gas turbines (GT), steam turbines (ST) and heat recovery steam generators. In a CCPP, the electricity is generated by gas and steam turbines, which are combined in one cycle, and is transferred from one turbine to another. While the Vacuum is collected from and has an effect on the Steam Turbine, the other three ambient variables affect the GT performance. For comparability with our baseline studies, and to allow 5x2 fold statistical tests to be carried out, we provide the data shuffled five times. For each shuffling 2-fold CV is carried out and the resulting 10 measurements are used for statistical testing. We provide the data both in .ods and in .xlsx formats.

Attribute Information:

Features consist of hourly average ambient variables:
- Temperature (T) in the range 1.81°C to 37.11°C
- Ambient Pressure (AP) in the range 992.89-1033.30 millibar
- Relative Humidity (RH) in the range 25.56% to 100.16%
- Exhaust Vacuum (V) in the range 25.36-81.56 cm Hg
- Net hourly electrical energy output (EP) 420.26-495.76 MW

The averages are taken from various sensors located around the plant that record the ambient variables every second. The variables are given without normalization.

1.2.2 Data_2

iris3 : Edgar Anderson’s Iris Data

iris3 gives the same data arranged as a 3-dimensional array of size 50 by 4 by 3, as represented by S-PLUS. The first dimension gives the case number within the species subsample, the second the measurements with names Sepal L., Sepal W., Petal L., and Petal W., and the third the species.


2 Analysis

2.1 Neuralnet Analysis

2.1.1 Packages Loading

library(readxl)
library(neuralnet)
library(nnet)

2.1.2 Data loading

mydata <- read_xlsx("Folds5x2_pp.xlsx")
# mydata_2 <- read_xlsx("Folds5x2_pp.xlsx", sheet = 2)
summary(mydata)
##        AT              V               AP               RH        
##  Min.   : 1.81   Min.   :25.36   Min.   : 992.9   Min.   : 25.56  
##  1st Qu.:13.51   1st Qu.:41.74   1st Qu.:1009.1   1st Qu.: 63.33  
##  Median :20.34   Median :52.08   Median :1012.9   Median : 74.97  
##  Mean   :19.65   Mean   :54.31   Mean   :1013.3   Mean   : 73.31  
##  3rd Qu.:25.72   3rd Qu.:66.54   3rd Qu.:1017.3   3rd Qu.: 84.83  
##  Max.   :37.11   Max.   :81.56   Max.   :1033.3   Max.   :100.16  
##        PE       
##  Min.   :420.3  
##  1st Qu.:439.8  
##  Median :451.6  
##  Mean   :454.4  
##  3rd Qu.:468.4  
##  Max.   :495.8

2.1.3 Data normalization

normalize <- function(x) {
  return ((x - min(x)) / (max(x) - min(x)) * 2 - 1)
}

mydata_norm <- as.data.frame(lapply(mydata, normalize))
summary(mydata_norm)
##        AT                 V                  AP           
##  Min.   :-1.00000   Min.   :-1.00000   Min.   :-1.000000  
##  1st Qu.:-0.33711   1st Qu.:-0.41708   1st Qu.:-0.197723  
##  Median : 0.05014   Median :-0.04911   Median :-0.007671  
##  Mean   : 0.01083   Mean   : 0.03010   Mean   : 0.008121  
##  3rd Qu.: 0.35467   3rd Qu.: 0.46548   3rd Qu.: 0.206137  
##  Max.   : 1.00000   Max.   : 1.00000   Max.   : 1.000000  
##        RH                 PE          
##  Min.   :-1.00000   Min.   :-1.00000  
##  1st Qu.: 0.01253   1st Qu.:-0.48371  
##  Median : 0.32480   Median :-0.17113  
##  Mean   : 0.28013   Mean   :-0.09656  
##  3rd Qu.: 0.58901   3rd Qu.: 0.27603  
##  Max.   : 1.00000   Max.   : 1.00000
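
Because the network is trained on values scaled to [-1, 1], its predictions come out on that same scale. The sketch below shows one way to map such values back to the original units. The helper `denormalize` and the toy vector `pe` are hypothetical names introduced here for illustration; they are not part of the original analysis.

```r
# Min-max scaling to [-1, 1], same formula as in the section above
normalize <- function(x) {
  (x - min(x)) / (max(x) - min(x)) * 2 - 1
}

# Hypothetical inverse: map values in [-1, 1] back to [x_min, x_max]
denormalize <- function(z, x_min, x_max) {
  (z + 1) / 2 * (x_max - x_min) + x_min
}

# Toy values spanning the PE range given in the attribute information
pe <- c(420.26, 451.60, 468.40, 495.76)
z <- normalize(pe)
pe_back <- denormalize(z, min(pe), max(pe))

print(range(z))                 # endpoints of the scaled range
print(all.equal(pe_back, pe))   # checks that the round trip recovers the inputs
```

Keeping the original minimum and maximum of each column is what makes the inverse possible, so in practice they would be stored before `mydata` is overwritten or discarded.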

2.1.4 Sampling & review

samp <- sample(1:9568, 5000)
mydataTR <- mydata_norm[samp,]
mydataTS <- mydata_norm[-samp,]

boxplot(mydataTR)

boxplot(mydataTS)
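
The split above is random, so every run yields different training and test sets and slightly different results. A minimal sketch of a reproducible split follows, assuming a fixed seed and a toy data frame `toy` in place of `mydata_norm`; the seed value and all names here are illustrative assumptions.

```r
set.seed(123)  # assumed seed; any fixed integer gives a repeatable split

# Toy data frame standing in for mydata_norm
toy <- data.frame(x = runif(100), y = runif(100))

n_train <- 60
# Sample row indices without replacement, using the actual row count
idx <- sample(seq_len(nrow(toy)), n_train)
toy_TR <- toy[idx, ]    # training rows
toy_TS <- toy[-idx, ]   # remaining rows form the test set

print(c(train = nrow(toy_TR), test = nrow(toy_TS)))
```

Using `seq_len(nrow(...))` instead of a hard-coded `1:9568` keeps the split valid if the number of rows changes.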


2.1.5 Analysis

mynn_H1 <- neuralnet(PE ~ AT + V + AP + RH, data = mydataTR)
str(mynn_H1)
## List of 14
##  $ call               : language neuralnet(formula = PE ~ AT + V + AP + RH, data = mydataTR)
##  $ response           : num [1:5000, 1] -0.619 -0.493 0.695 0.263 -0.668 ...
##   ..- attr(*, "dimnames")=List of 2
##   .. ..$ : chr [1:5000] "5894" "3274" "4239" "5162" ...
##   .. ..$ : chr "PE"
##  $ covariate          : num [1:5000, 1:4] 0.564 0.414 -0.789 -0.338 0.708 ...
##   ..- attr(*, "dimnames")=List of 2
##   .. ..$ : chr [1:5000] "5894" "3274" "4239" "5162" ...
##   .. ..$ : chr [1:4] "AT" "V" "AP" "RH"
##  $ model.list         :List of 2
##   ..$ response : chr "PE"
##   ..$ variables: chr [1:4] "AT" "V" "AP" "RH"
##  $ err.fct            :function (x, y)  
##   ..- attr(*, "type")= chr "sse"
##  $ act.fct            :function (x)  
##   ..- attr(*, "type")= chr "logistic"
##  $ linear.output      : logi TRUE
##  $ data               :'data.frame': 5000 obs. of  5 variables:
##   ..$ AT: num [1:5000] 0.564 0.414 -0.789 -0.338 0.708 ...
##   ..$ V : num [1:5000] 0.763 -0.158 -0.629 -0.416 0.617 ...
##   ..$ AP: num [1:5000] -0.4447 -0.2541 0.0191 -0.1205 -0.2334 ...
##   ..$ RH: num [1:5000] 0.0448 0.1928 0.8399 0.0129 0.0912 ...
##   ..$ PE: num [1:5000] -0.619 -0.493 0.695 0.263 -0.668 ...
##  $ exclude            : NULL
##  $ net.result         :List of 1
##   ..$ : num [1:5000, 1] -0.659 -0.436 0.723 0.342 -0.713 ...
##   .. ..- attr(*, "dimnames")=List of 2
##   .. .. ..$ : chr [1:5000] "5894" "3274" "4239" "5162" ...
##   .. .. ..$ : NULL
##  $ weights            :List of 1
##   ..$ :List of 2
##   .. ..$ : num [1:5, 1] -0.582 -1.4594 -0.3421 0.0645 -0.2151
##   .. ..$ : num [1:2, 1] -1.09 2.8
##  $ generalized.weights:List of 1
##   ..$ : num [1:5000, 1:4] 0.487 1.169 -4.672 -4.544 0.391 ...
##   .. ..- attr(*, "dimnames")=List of 2
##   .. .. ..$ : chr [1:5000] "5894" "3274" "4239" "5162" ...
##   .. .. ..$ : NULL
##  $ startweights       :List of 1
##   ..$ :List of 2
##   .. ..$ : num [1:5, 1] 1.02 -0.499 -1.933 -0.604 0.953
##   .. ..$ : num [1:2, 1] -0.0555 2.761
##  $ result.matrix      : num [1:10, 1] 3.23e+01 8.88e-03 9.82e+03 -5.82e-01 -1.46 ...
##   ..- attr(*, "dimnames")=List of 2
##   .. ..$ : chr [1:10] "error" "reached.threshold" "steps" "Intercept.to.1layhid1" ...
##   .. ..$ : NULL
##  - attr(*, "class")= chr "nn"
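# err.fct: error function; act.fct: activation function (logistic means the sigmoid function)
# net.result: network outputs; weights: trained weights; startweights: initial weights
# linear.output = TRUE: the output node uses a linear (identity) activation function

# ReLU: 0 for inputs up to 0, linear (identity) for positive inputs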

plot(mynn_H1)
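
To make the roles of `weights`, `act.fct` and `linear.output` concrete, the following sketch recomputes the first fitted value by hand from the numbers printed by `str(mynn_H1)`. The inputs and weights are copied from that output, where they are truncated, so the result only approximately matches the first entry of `net.result` (-0.659). Variable names such as `w_hidden` and `w_out` are illustrative.

```r
# Logistic (sigmoid) activation, the act.fct reported for the hidden node
logistic <- function(x) 1 / (1 + exp(-x))

# First training row as printed in $covariate: AT, V, AP, RH
x <- c(0.564, 0.763, -0.4447, 0.0448)

# Trained weights as printed in $weights (intercept first)
w_hidden <- c(-0.582, -1.4594, -0.3421, 0.0645, -0.2151)  # intercept + 4 inputs
w_out    <- c(-1.09, 2.8)                                 # intercept + 1 hidden node

# Hidden node: weighted sum of inputs, then logistic activation
h <- logistic(w_hidden[1] + sum(w_hidden[-1] * x))

# Output node: linear, because linear.output is TRUE
y_hat <- w_out[1] + w_out[2] * h
print(y_hat)
```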

# Evaluation on the test set (TS)
mynn_H1_resTS <- compute(mynn_H1, mydataTS[ ,1:4])
sqr_err <- (mynn_H1_resTS$net.result - mydataTS[ ,5]) ^2
mes <- sum(sqr_err)/length(sqr_err) # mean squared error (MSE)

mes 
## [1] 0.01354148
cor(mydataTS[ ,5], mynn_H1_resTS$net.result)
##           [,1]
## [1,] 0.9664969

# Case with 3 hidden nodes in a single hidden layer
mynn_H3 <- neuralnet(PE ~ AT + V + AP + RH, data = mydataTR, hidden=3) # 3 hidden nodes
plot(mynn_H3)

mynn_H3_resTS <- compute(mynn_H3, mydataTS[ ,1:4])
sqr_err_H3 <- (mynn_H3_resTS$net.result - mydataTS[ ,5]) ^2
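mes_H3 <- sum(sqr_err_H3)/length(sqr_err_H3) # mean squared error (MSE) of the 3-node network

mes_H3 
# The printed value depends on the random train/test split and the initial weights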
cor(mydataTS[ ,5], mynn_H3_resTS$net.result)
##           [,1]
## [1,] 0.9688278
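
The MSE values above are on the normalized [-1, 1] scale, which makes them hard to interpret. As a rough sketch, the test MSE of the one-node network (0.01354148) can be converted to a root mean squared error in MW by using the PE range from the attribute information (420.26 to 495.76 MW). The variable names are illustrative.

```r
mse_norm <- 0.01354148   # test MSE of mynn_H1 on the [-1, 1] scale
pe_min <- 420.26         # PE range from the attribute information
pe_max <- 495.76

# The [-1, 1] interval has width 2, so one normalized unit
# corresponds to half of the original range
scale_factor <- (pe_max - pe_min) / 2

rmse_mw <- sqrt(mse_norm) * scale_factor
print(rmse_mw)
```

Under these assumptions the typical prediction error is roughly 4.4 MW on an output that ranges over about 75 MW.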

2.2 Nnet Analysis

2.2.1 Data loading

data(iris3)
str(iris3)
##  num [1:50, 1:4, 1:3] 5.1 4.9 4.7 4.6 5 5.4 4.6 5 4.4 4.9 ...
##  - attr(*, "dimnames")=List of 3
##   ..$ : NULL
##   ..$ : chr [1:4] "Sepal L." "Sepal W." "Petal L." "Petal W."
##   ..$ : chr [1:3] "Setosa" "Versicolor" "Virginica"
iris_input <- rbind(iris3[,,1], iris3[,,2], iris3[,,3])
summary(iris_input)
##     Sepal L.        Sepal W.        Petal L.        Petal W.    
##  Min.   :4.300   Min.   :2.000   Min.   :1.000   Min.   :0.100  
##  1st Qu.:5.100   1st Qu.:2.800   1st Qu.:1.600   1st Qu.:0.300  
##  Median :5.800   Median :3.000   Median :4.350   Median :1.300  
##  Mean   :5.843   Mean   :3.057   Mean   :3.758   Mean   :1.199  
##  3rd Qu.:6.400   3rd Qu.:3.300   3rd Qu.:5.100   3rd Qu.:1.800  
##  Max.   :7.900   Max.   :4.400   Max.   :6.900   Max.   :2.500
iris_out <- class.ind(c(rep("s", 50), rep("c", 50), rep("v", 50)))
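
`class.ind` builds a one-hot indicator matrix whose columns follow the sorted factor levels, not the order in which the labels first appear. The short sketch below, which uses a shortened label vector as an assumption for readability, shows that the columns come out as c, s, v, which is also the column order seen later in the fitted values.

```r
library(nnet)

# Shortened label vector in the same order as above: s first, then c, then v
labels <- c(rep("s", 2), rep("c", 2), rep("v", 2))
ind <- class.ind(labels)

print(colnames(ind))   # column order follows the sorted levels
print(ind)             # each row has a single 1 in the column of its class
```

This matters when reading `max.col` results: index 1 refers to "c", index 2 to "s", and index 3 to "v".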

2.2.2 Sampling & review

samp <- c(sample(1:50, 25), sample(51:100, 25), sample(101:150, 25))

iris_inputTR <- iris_input[samp, ]
iris_inputTS <- iris_input[-samp, ]
iris_outTR <- iris_out[samp, ]
iris_outTS <- iris_out[-samp, ]

boxplot(iris_inputTR)

boxplot(iris_inputTS)


2.2.3 Analysis

irisnet <- nnet(iris_inputTR, iris_outTR, size = 3, 
                rang = 0.1, abstol = 0.00001, maxit = 500)
## # weights:  27
## initial  value 55.812151 
## iter  10 value 28.207018
## iter  20 value 0.094684
## iter  30 value 0.000933
## final  value 0.000007 
## converged
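# size: number of hidden nodes; rang: initial random weights are drawn from [-rang, rang]
# abstol: stopping criterion (training stops when the fit criterion falls below 0.00001)
# maxit: maximum number of iterations (training stops when this value is reached)

# In the run above, the error fell below 0.00001 shortly after iteration 30, so training stopped (the "converged" message is printed)
# weights: 27 is the number of weight parameters
# The final training error (squared error) is 0.000007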

str(irisnet)
## List of 15
##  $ n            : num [1:3] 4 3 3
##  $ nunits       : int 11
##  $ nconn        : num [1:12] 0 0 0 0 0 0 5 10 15 19 ...
##  $ conn         : num [1:27] 0 1 2 3 4 0 1 2 3 4 ...
##  $ nsunits      : int 11
##  $ decay        : num 0
##  $ entropy      : logi FALSE
##  $ softmax      : logi FALSE
##  $ censored     : logi FALSE
##  $ value        : num 7.28e-06
##  $ wts          : num [1:27] 1.49 1.19 -5.46 3.06 1.64 ...
##  $ convergence  : int 0
##  $ fitted.values: num [1:75, 1:3] 5.30e-06 5.25e-06 4.04e-06 5.16e-06 4.13e-06 ...
##   ..- attr(*, "dimnames")=List of 2
##   .. ..$ : NULL
##   .. ..$ : chr [1:3] "c" "s" "v"
##  $ residuals    : num [1:75, 1:3] -5.30e-06 -5.25e-06 -4.04e-06 -5.16e-06 -4.13e-06 ...
##   ..- attr(*, "dimnames")=List of 2
##   .. ..$ : NULL
##   .. ..$ : chr [1:3] "c" "s" "v"
##  $ call         : language nnet.default(x = iris_inputTR, y = iris_outTR, size = 3, rang = 0.1,      maxit = 500, abstol = 1e-05)
##  - attr(*, "class")= chr "nnet"
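# $n: number of nodes (input layer, hidden layer, output layer)
# $decay: weight-decay (regularization) strength used to prevent overfitting
# $entropy: whether the cross-entropy error is used
# $softmax: whether the softmax activation function is used
# $convergence: convergence code (0 if the training error converged, 1 if maxit was reached)
# entropy and softmax are both FALSE, which means the network was trained with the ordinary squared error and the sigmoid function rather than the cross-entropy error and the softmax activation function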
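
The reported count of 27 weights follows directly from the layer sizes in `$n`. As a small worked check, each hidden node receives one weight per input plus a bias, and each output node receives one weight per hidden node plus a bias. The variable names below are illustrative.

```r
n <- c(4, 3, 3)   # nodes per layer: input, hidden, output (from $n)

# (inputs + bias) * hidden nodes  +  (hidden nodes + bias) * output nodes
n_weights <- (n[1] + 1) * n[2] + (n[2] + 1) * n[3]
print(n_weights)  # 15 + 12
```

This count assumes no skip-layer connections, which matches the call shown in `$call`.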

# Evaluation
irisTS_res <- predict(irisnet, iris_inputTS)
str(irisTS_res)
##  num [1:75, 1:3] 4.21e-06 5.17e-06 4.05e-06 4.19e-06 4.17e-06 ...
##  - attr(*, "dimnames")=List of 2
##   ..$ : NULL
##   ..$ : chr [1:3] "c" "s" "v"
clabel_nn <- max.col(irisTS_res) # for each row, find the column holding the maximum value
clabel_true <- max.col(iris_outTS)

err_cnt <- ((clabel_nn-clabel_true)!=0)
Ecls <- sum(err_cnt)/75   # Ecls: misclassification rate

Ecls
## [1] 0.04
table(clabel_true, clabel_nn)
##            clabel_nn
## clabel_true  1  2  3
##           1 22  0  3
##           2  0 25  0
##           3  0  0 25
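
The misclassification rate and per-class results can also be read off the confusion table itself. The sketch below re-enters the table printed above as a matrix and labels indices 1, 2, 3 as c, s, v, following the column order of the network outputs; the names `conf`, `error_rate` and `recall` are illustrative.

```r
# Confusion table from above: rows are true classes, columns are predictions
conf <- matrix(c(22,  0,  3,
                  0, 25,  0,
                  0,  0, 25),
               nrow = 3, byrow = TRUE,
               dimnames = list(true = c("c", "s", "v"),
                               predicted = c("c", "s", "v")))

# Off-diagonal entries are errors
error_rate <- 1 - sum(diag(conf)) / sum(conf)

# Share of each true class that was predicted correctly
recall <- diag(conf) / rowSums(conf)

print(error_rate)
print(recall)
```

All three errors are class "c" samples predicted as "v", while "s" and "v" are classified without error in this run.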

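# To train with the cross-entropy error function and softmax outputs
# (the nnet documentation describes linout, entropy, softmax and censored as mutually exclusive, so softmax = TRUE alone may be sufficient)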
irisnet_SM <- nnet(
  iris_inputTR,
  iris_outTR,
  size = 3,
  entropy = TRUE,
  softmax = TRUE,
  rang = 0.1,
  maxit = 500
)
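
For reference, the softmax function and the cross-entropy error named above can be written in a few lines of base R. This is a generic sketch of the two formulas, not the internal implementation of nnet; the score vector `z` and the target `y` are made-up values.

```r
# Softmax: turns raw output scores into probabilities that sum to 1
softmax <- function(z) {
  e <- exp(z - max(z))   # subtracting the maximum avoids overflow
  e / sum(e)
}

# Cross-entropy between predicted probabilities p and a one-hot target y
cross_entropy <- function(p, y) {
  -sum(y * log(p))
}

z <- c(2, 1, -1)   # hypothetical output-node scores for one sample
y <- c(1, 0, 0)    # one-hot target: the sample belongs to the first class
p <- softmax(z)

print(p)
print(sum(p))
print(cross_entropy(p, y))
```

Unlike the squared error on independent sigmoid outputs, this pairing treats the three outputs as a single probability distribution over the classes.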