함수형 프로그래밍이란?
반복문(for loop)와 벡터 연산
재귀호출 : 하노이의 탑
클로져의 사용
- 문자열을 “+” 또는 “,”로 병합
Functionals : 함수의 인수로 함수 사용하기
subset의 회귀식 구하기
- 절차적 프로그래밍
- plyr패키지를 이용한 함수형 프로그래밍

함수형 프로그래밍이란?

위키피디아 정의 : 함수형 프로그래밍(Functional Programming)은 자료 처리를 수학적 함수의 계산으로 취급하고 상태와 가변 데이터를 멀리하는 프로그래밍 패러다임의 하나이다. 명령형 프로그래밍에서는 상태를 바꾸는 것을 강조하는 것과는 달리, 함수형 프로그래밍은 함수의 응용을 강조한다. 함수형 프로그래밍은 1930년대에 계산가능성, 결정문제, 함수정의, 함수응용과 재귀를 연구하기 위해 개발된 형식체계인 람다 대수에 근간을 두고 있다. 다수의 함수형 프로그래밍 언어들은 람다 연산을 발전시킨 것으로 볼 수 있다.

R과 함수형 프로그래밍

Hadley Wicham : R은 본질적으로 함수형 프로그래밍 언어이다. R은 함수를 만들고 함수를 다루는 많은 도구를 제공한다. R은 벡터로 할 수 있는 모든 것을 함수로 할 수 있다. 함수를 변수에 할당하고 리스트에 저장하고 다른 함수의 인수로 넘겨주고 함수 안에서 함수를 만들고 함수의 결과로써 함수를 반환할 수도 있다.

함수형 프로그래밍을 위해 R에서 공부해야 개념들

무기명함수
클로져(함수에 의해 만들어지는 함수)
함수리스트
Functionals(함수을 인수로 받아들이고 벡터를 반환하는 함수)
Function operator(함수를 인수로 받아들이고 함수를 반환하는 함수)

오늘 다룰 예제

반복문과 벡터연산
재귀호출
클로져의 사용 예
1. 문자열을 “+”로 병합
Functionals의 사용 예
1. NA값을 평균 값을 입력
2. 에러의 처리 5.subset의 회귀식 구하기

반복문(for loop)와 벡터 연산

moonBook 패키지에는 857명의 환자 자료가 있다. 자료의 갯수를 증가시키기 위해 rbind함수를 이용한다.

require(moonBook)

data(acs)
nrow(acs)

[1] 857

for(i in 1:7) acs=rbind(acs,acs)
nrow(acs)

[1] 109696

모두 109696예의 자료를 이용해 for loop와 벡터연산을 비교해본다.

for loop를 사용하는 연산

BMI<-c()
system.time(
    for(i in 1:nrow(acs)) {
    BMI <- c(BMI,acs$weight[i]*10000/acs$height[i])
    }
)

   user  system elapsed 
 24.123   4.362  28.495

벡터 연산

system.time(BMI<-acs$weight*10000/acs$height)

   user  system elapsed 
  0.001   0.001   0.001

재귀호출 : 하노이의 탑

Tower of hanoi

#' Demonstrate the Tower of Hanoi puzzle in R
#'
#' This function uses the recursive algorithm to solve the Tower of Hanoi
#' puzzle, and demonstrates the game in animation.
#'
#' This function was written by Linlin Yan <linlin.yan@@cos.name> in a Chinese
#' forum (See 'References') to show the usage of recursive algorithm.
#' @param n an integer indicating the number of disks on the rot.
#' @seealso \code{\link[graphics]{barplot}}
#' @references Original code: \url{http://cos.name/cn/topic/101199}
#'
#'   About the Tower of Hanoi: \url{http://en.wikipedia.org/wiki/Tower_of_Hanoi}
#' @author Linlin Yan <\email{linlin.yan@@cos.name}>
#' }
tower_of_hanoi <- function(n = 7) {
    if (!interactive()) return()
    tower <- list(1:n, NULL, NULL)
    color <- rainbow(n)
    par(mfrow = c(1, 3), mar = rep(0, 4), ann = FALSE)
    bgcolor <- par("bg")
    if (bgcolor == "transparent") bgcolor <- "white"
    
    draw.hanoi <- function() {
        for (i in 1:3) {
            plot(c(-n, n), c(0, n + 2), type = "n", xlab = "",
                 ylab = "", axes = FALSE)
            rect(-n, 0, n, n + 2, border = bgcolor, col = bgcolor)
            if (length(tower[[i]]) > 0) {
                barplot(rev(tower[[i]]), add = TRUE, horiz = TRUE,
                        col = color[rev(tower[[i]])])
                barplot(-rev(tower[[i]]), add = TRUE, horiz = TRUE,
                        col = color[rev(tower[[i]])])
            }
        }
    }
    
    move.hanoi <- function(k, from, to, via) {
        if (k > 1) {
            move.hanoi(k - 1, from, via, to)
            move.hanoi(1, from, to, via)
            move.hanoi(k - 1, via, to, from)
        }
        else {
            cat("Move ", tower[[from]][1], " from ", LETTERS[from],
                " to ", LETTERS[to], "\n")
            tower[[to]] <<- c(tower[[from]][1], tower[[to]])
            tower[[from]] <<- tower[[from]][-1]
            draw.hanoi()
            Sys.sleep(0.5)
        }
    }
    
    draw.hanoi()
    move.hanoi(n, 1, 2, 3)
    par(mfrow = c(1, 1))
}

tower_of_hanoi(3)
tower_of_hanoi(7)

클로져의 사용

문자열을 “+” 또는 “,”로 병합

다음과 같은 문자열 벡터가 있을 때 각각의 문자열을 “+”기호로 연결한 문자열을 만들고 싶다. 예를 들어 “A”,“B”,“C”의 세개의 원소를 갖는 문자열 벡터가 있을때 “A+B+C”라는 문자열을 만들고 싶으면 다음과 같이 할수 있다.

x=c("A","B","C")

result=paste0(x[1],"+",x[2],"+",x[3])
result

[1] "A+B+C"

하지만 문자열의 갯수가 많아지면 위와 같은 방법은 매우 번거롭다. 예를들어 “January”부터 “December”까지의 문자들을 모두 “+”로 연결하려면 어떻게 할까?

month.name

 [1] "January"   "February"  "March"     "April"     "May"      
 [6] "June"      "July"      "August"    "September" "October"  
[11] "November"  "December"

result=month.name[1]
for(i in 2:length(month.name)) result=paste0(result,"+",month.name[i])
result

[1] "January+February+March+April+May+June+July+August+September+October+November+December"

이러한 반복작업인 경우 Reduce 함수를 쓰면 코드의 양이 줄어들고 불필요한 loop를 없앨 수 있다. 먼저 두개의 문자열을 “+”로 연결하는 함수를 만든다.

pasteadd=function(...){
    paste(...,sep="+")
}
Reduce(pasteadd,month.name)

[1] "January+February+March+April+May+June+July+August+September+October+November+December"

이번에는 문자열을 “-”로 연결하는 함수와 “,”로 연결하는 함수, “:”과 “;”으로 연결하는 함수를 만들고 싶으면 어떻게 할까? 클로져를 쓰면 쉽게 해결할 수 있다.

pasteSep=function(sep){
    function(...){
       paste(...,sep=sep)
    }
}
pasteadd=pasteSep("+")
pasteminus=pasteSep("-")
pastecomma=pasteSep(",")
pastecolon=pasteSep(":")
pastesemicolon=pasteSep(";")

Reduce(pasteadd,month.name)

[1] "January+February+March+April+May+June+July+August+September+October+November+December"

Reduce(pastecomma,month.name)

[1] "January,February,March,April,May,June,July,August,September,October,November,December"

Reduce(pastecolon,month.name)

[1] "January:February:March:April:May:June:July:August:September:October:November:December"

Reduce(pastesemicolon,month.name)

[1] "January;February;March;April;May;June;July;August;September;October;November;December"

Functionals : 함수의 인수로 함수 사용하기

연습문제 1. NA값을 평균 값으로 입력하기

다음과 같은 행렬이 있다고 하자

set.seed(124)
mat=matrix(sample(100,size=60,replace=TRUE),nrow=6)
mat

     [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
[1,]    9   59   76   89   35   83   20   68   45    25
[2,]   41   50   86    4   21   31   94   25   15    37
[3,]   52   93   41   63   85   44   15   42   43    24
[4,]   40   28    6   61   27   87   32   33   84    75
[5,]   23   78   58    8   84   12   14   96   53    20
[6,]   30   86   75   42   39   97   83   25   22    57

이중 임의의 10개 값을 NA로 만든다.

mat[sample(60,10)]=NA
mat

     [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
[1,]   NA   59   76   NA   35   83   20   68   45    25
[2,]   41   50   86    4   21   NA   94   25   15    37
[3,]   NA   93   41   63   NA   44   15   42   43    24
[4,]   40   NA    6   61   27   87   32   33   84    75
[5,]   23   78   58    8   84   12   14   96   53    NA
[6,]   30   86   NA   42   NA   97   83   25   22    NA

이 행렬의 NA 값을 그 열의 평균값으로 바꾸려면 어떻게 할까?

정답 1

힌트 ; 행렬이 아닌 벡터라면 어떻게 할까?

예를 들어 다음과 같은 자료가 있을때

x=c(1,3,5,NA)

is.na() 함수는 R 객체 x에서 NA값이 있을때 TRUE를 반환한다.

is.na(x)

[1] FALSE FALSE FALSE  TRUE

그럼 x 중 NA값은 x[is.na(x)]로 하면 된다. x의 평균을 mean(x)로 해보면 NA 값이 있는 경우 계산이 안되므로 NA 값을 제외하고 평균을 구하려면 mean(x,na.rm=TRUE)로 하면 된다.

mean(x)

[1] NA

mean(x,na.rm=TRUE)

[1] 3

NA값이 있는 x에 x의 평균값을 집어 넣으려면 다음과 같이 하면 된다.

x[is.na(x)]=mean(x,na.rm=T)
x

[1] 1 3 5 3

이제 NA값을 평균으로 바꿔주는 일을 하는 함수를 하나 만들어보면 다음과 같다.

repNA=function(x) {
       x[is.na(x)]=mean(x,na.rm=T)
       x
}

함수가 제대로 작동하는지 테스트해본다.

x=c(1,3,5,NA)
repNA(x)

[1] 1 3 5 3

그럼 문제의 행렬을 만들고 apply함수를 적용해보면 다음과 같다.

set.seed(124)
mat=matrix(sample(100,size=60,replace=TRUE),nrow=6)
mat[sample(60,10)]=NA
mat

     [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
[1,]   NA   59   76   NA   35   83   20   68   45    25
[2,]   41   50   86    4   21   NA   94   25   15    37
[3,]   NA   93   41   63   NA   44   15   42   43    24
[4,]   40   NA    6   61   27   87   32   33   84    75
[5,]   23   78   58    8   84   12   14   96   53    NA
[6,]   30   86   NA   42   NA   97   83   25   22    NA

apply(mat,2,repNA)

     [,1] [,2] [,3] [,4]  [,5] [,6] [,7] [,8] [,9] [,10]
[1,] 33.5 59.0 76.0 35.6 35.00 83.0   20   68   45 25.00
[2,] 41.0 50.0 86.0  4.0 21.00 64.6   94   25   15 37.00
[3,] 33.5 93.0 41.0 63.0 41.75 44.0   15   42   43 24.00
[4,] 40.0 73.2  6.0 61.0 27.00 87.0   32   33   84 75.00
[5,] 23.0 78.0 58.0  8.0 84.00 12.0   14   96   53 40.25
[6,] 30.0 86.0 53.4 42.0 41.75 97.0   83   25   22 40.25

repNA함수를 무기명함수로 축약해서 쓴다면

function(x) { x[is.na(x)]=mean(x,na.rm=TRUE);x}

로 축약해서 쓸 수 있다.참고로 열의 평균이 아닌 행의 평균으로 NA값을 바꾸려면

apply(mat,1,  function(x) { x[is.na(x)]=mean(x,na.rm=TRUE);x})

        [,1]     [,2]   [,3]     [,4]     [,5] [,6]
 [1,] 51.375 41.00000 45.625 40.00000 23.00000   30
 [2,] 59.000 50.00000 93.000 49.44444 78.00000   86
 [3,] 76.000 86.00000 41.000  6.00000 58.00000   55
 [4,] 51.375  4.00000 63.000 61.00000  8.00000   42
 [5,] 35.000 21.00000 45.625 27.00000 84.00000   55
 [6,] 83.000 41.44444 44.000 87.00000 12.00000   97
 [7,] 20.000 94.00000 15.000 32.00000 14.00000   83
 [8,] 68.000 25.00000 42.000 33.00000 96.00000   25
 [9,] 45.000 15.00000 43.000 84.00000 53.00000   22
[10,] 25.000 37.00000 24.000 75.00000 47.33333   55

에러처리

카이제곱 검정을 할 때 표본수가 너무 적으면 결과가 부정확할 수 있고 그런 경우 Fisher’s exact test를 사용한다. 다음의 예를보자.

x1 <- matrix(c(1,2,
               3,4),nrow=2,byrow=TRUE)
chisq.test(x1)

Warning in chisq.test(x1): Chi-squared approximation may be incorrect


    Pearson's Chi-squared test with Yates' continuity correction

data:  x1
X-squared = 2.9348e-32, df = 1, p-value = 1

fisher.test(x1)


    Fisher's Exact Test for Count Data

data:  x1
p-value = 1
alternative hypothesis: true odds ratio is not equal to 1
95 percent confidence interval:
  0.008512238 20.296715040
sample estimates:
odds ratio 
  0.693793

x2 <- matrix(c(11,12,
               15,10),nrow=2,byrow=TRUE)
chisq.test(x2)


    Pearson's Chi-squared test with Yates' continuity correction

data:  x2
X-squared = 0.30881, df = 1, p-value = 0.5784

fisher.test(x2)


    Fisher's Exact Test for Count Data

data:  x2
p-value = 0.5627
alternative hypothesis: true odds ratio is not equal to 1
95 percent confidence interval:
 0.1671249 2.2165248
sample estimates:
odds ratio 
 0.6174675

통계의 자동화를 위해 카이제곱검정 결과가 문제가 없으면 카이제곱 검정 결과를 보여주고 warning이 나오면 자동으로 fisher’s test를 시행하는 함수를 만들려면 어떻게 해야 할까? tryCatch()함수를 사용하면 에러, 경고, 메시지 등이 있을 때 각각의 처리함수를 정해 수행되게 할수 있다. 다음의 예를 보자.

show_condition <- function(code) {
  tryCatch(code,
    error = function(c) "error",
    warning = function(c) "warning",
    message = function(c) "message"
  )
}

다음의 예를 보자.

show_condition(stop("!"))

[1] "error"

show_condition(warning("?!"))

[1] "warning"

show_condition(message("?"))

[1] "message"

error나 warning, message가 없을 때는 입력의 결과를 반환한다.

show_condition(10)

[1] 10

그렇다면 chisq.test()는 어떤 값을 반환할까?

# Warning이 발생한다.
result=chisq.test(x1)

Warning in chisq.test(x1): Chi-squared approximation may be incorrect

result


    Pearson's Chi-squared test with Yates' continuity correction

data:  x1
X-squared = 2.9348e-32, df = 1, p-value = 1

str(result)

List of 9
 $ statistic: Named num 2.93e-32
  ..- attr(*, "names")= chr "X-squared"
 $ parameter: Named int 1
  ..- attr(*, "names")= chr "df"
 $ p.value  : num 1
 $ method   : chr "Pearson's Chi-squared test with Yates' continuity correction"
 $ data.name: chr "x1"
 $ observed : num [1:2, 1:2] 1 3 2 4
 $ expected : num [1:2, 1:2] 1.2 2.8 1.8 4.2
 $ residuals: num [1:2, 1:2] -0.1826 0.1195 0.1491 -0.0976
 $ stdres   : num [1:2, 1:2] -0.282 0.282 0.282 -0.282
 - attr(*, "class")= chr "htest"

chisq.test()함수는 “htest” 클래스의 객체를 반환한다. 이 함수를 tryCatch함수의 input으로 사용하면 warning이 생기는 경우에는 해당 함수의 결과가 반환된다.

#  Warning이 생기지 않는다. 하지만 "warnings present"가 반환된다. 
result=tryCatch(chisq.test(x1),warning=function(w) return("warnings present"))
result

[1] "warnings present"

str(result)

 chr "warnings present"

chisq.test(x2)인 경우 tryCatch함수의 input으로 사용해도 warning이 생기지 않기 때문에 해당 함수의 결과가 반환된다.

result=tryCatch(chisq.test(x2),warning=function(w) return("warnings present"))
result


    Pearson's Chi-squared test with Yates' continuity correction

data:  x2
X-squared = 0.30881, df = 1, p-value = 0.5784

연습문제 2

새로운 함수를 하나 만들어 통계의 자동화를 위해 카이제곱검정 결과가 문제가 없으면 카이제곱 검정 결과를 보여주고 warning이 나오면 자동으로 fisher’s test를 시행하는 함수를 만들려면 어떻게 해야 할까?

연습문제 2 정답

x1 <- matrix(c(1,2,
               3,4),nrow=2,byrow=TRUE)
x2 <- matrix(c(11,12,
               15,10),nrow=2,byrow=TRUE)

mychisq.test=function(x){
        result=tryCatch(chisq.test(x),warning=function(w) return("warnings present"))
        if(class(result)!="htest"){
                result=fisher.test(x)
        }
        result
}

mychisq.test(x1)


    Fisher's Exact Test for Count Data

data:  x
p-value = 1
alternative hypothesis: true odds ratio is not equal to 1
95 percent confidence interval:
  0.008512238 20.296715040
sample estimates:
odds ratio 
  0.693793

mychisq.test(x2)


    Pearson's Chi-squared test with Yates' continuity correction

data:  x
X-squared = 0.30881, df = 1, p-value = 0.5784

subset의 회귀식 구하기

ggplot2 를 이용하면 산점도 및 회귀선을 쉽게 그릴 수 있다. 예를 들어 radial 데이터에서 동맥경화 정도(NTAV)와 나이(age)의 관계를 산점도로 그리고 선형회귀선을 추가하려면 다음과 같이 할 수 있다.

require(moonBook)
require(ggplot2)
ggplot(data=radial,aes(x=age,y=NTAV))+geom_point()+stat_smooth(method="lm")

이때 산점도의 점색깔을 흡연력(smoking)에 따라 구분하고 싶으면 다음과 하면 된다.

ggplot(data=radial,aes(x=age,y=NTAV,color=smoking))+geom_point()+stat_smooth(method="lm",se=FALSE)

회귀직선은 stat_smooth()를 사용하면 자동으로 그려지지만 회귀직선 식을 알고 싶으면 어떻게 할까? lm()함수를 사용하여 선형회귀에 적합시키고 회귀직선의 기울기와 y절편을 구하려면 다음과 같이 한다.

fit=lm(NTAV~age,data=radial)
fit


Call:
lm(formula = NTAV ~ age, data = radial)

Coefficients:
(Intercept)          age  
    44.3398       0.3848

fit$coef

(Intercept)         age 
 44.3397932   0.3847666

equation=paste0("y=",fit$coef[2],"*x + ",fit$coef[1])
equation

[1] "y=0.384766638728455*x + 44.3397931543193"

equation=paste0("y=",round(fit$coef[2],1),"*x + ",round(fit$coef[1],1))
equation

[1] "y=0.4*x + 44.3"

smoking이라는 그룹변수에 따라 다른 회귀식을 얻으려면 어떻게 해야할까?

절차적 프로그래밍

Group 변수가 하나인 경우

먼저 그룹변수(smoking이라는 열)에 있는 고유값이 몇개인지 알아낸다

(group=unique(radial$smoking))

[1] non-smoker smoker     ex-smoker 
Levels: ex-smoker  non-smoker smoker

데이타를 그룹변수에 따라 subset으로 나눈 후 각각 회귀분석을 해서 회귀식을 얻는다.

equations=c()
for(i in 1:length(group)){
    subdata=radial[radial$smoking==group[i],]
    fit=lm(NTAV~age,data=subdata)
    temp=paste0("y=",round(fit$coef[2],1),"*x + ",round(fit$coef[1],1))
    equations=c(equations,temp)
}
equations

[1] "y=0.3*x + 45.5" "y=1*x + 14.9"   "y=0.3*x + 44.5"

그룹변수가 두개인 경우는? 예를 들어 고혈압 유무와 흡연으로 나누면?

require(ggiraph)
require(plyr)
require(ggiraphExtra)
ggPoints(data=radial,aes(x=age,y=NTAV,color=smoking,facet=HBP),
         method="lm",se=FALSE)

group1=unique(radial$smoking)
group2=unique(radial$HBP)

smoking<-HBP<-equations<-c()
for(i in 1:length(group1)){
    for(j in 1:length(group2)){
        condition=radial$smoking==group1[i] & radial$HBP==group2[j]
        subdata=radial[condition,]
        fit=lm(NTAV~age,data=subdata)
        temp=paste0("y=",round(fit$coef[2],1),"*x + ",round(fit$coef[1],1))
        equations<-c(equations,temp)
        smoking<-c(smoking,group1[i])
        HBP<-c(HBP,group2[j])
    }
}
data.frame(smoking=smoking,HBP=HBP,equation=equations)

  smoking HBP        equation
1       2   0  y=0.4*x + 35.1
2       2   1  y=0.2*x + 56.5
3       3   0    y=0.5*x + 37
4       3   1    y=1*x + 19.3
5       1   0 y=-0.2*x + 72.9
6       1   1      y=0*x + 74

그런데 고혈압 유무와 성별로 나누면?

ggplot(data=radial,aes(x=age,y=NTAV,color=smoking))+
           geom_point()+
           geom_smooth(method="lm",se=FALSE)+
           facet_wrap(~sex)

group1=unique(radial$smoking)
group2=unique(radial$sex)


for(i in 1:length(group1)){
    for(j in 1:length(group2)){
        condition=radial$smoking==group1[i] & radial$sex==group2[j]
        subdata=radial[condition,]
        fit=lm(NTAV~age,data=subdata)
        temp=paste0("y=",round(fit$coef[2],1),"*x + ",round(fit$coef[1],1))
        equations=c(equations,temp)
    }
}

Error in lm.fit(x, y, offset = offset, singular.ok = singular.ok, ...): 0 (non-NA) cases

이 경우 여성 중 ex-smoker가 없으므로 여성이면서 ex-smoker인 subdata가 없어서 회귀분석할 때 error가 난다.

group변수가 세개, 네개인 경우는? for loop를 3중 4중으로?

ggplot(data=radial,aes(x=age,y=NTAV,color=smoking,fill=sex))+
           geom_point(shape=21)+
           stat_smooth(method="lm",se=FALSE)+
           facet_wrap(~HBP)

plyr패키지를 이용한 함수형 프로그래밍

plyr의 helper function들 중 splat() 함수는 여러 개의 인수를 갖는 함수를 하나의 리스트를 인수로 갖는 함수로 바꾸어 준다. 이 함수는 데이타 프레임을 쪼개지 않고 어떤 함수를 적용시키려 할때 아주 유용하다. 이 경우 데이타 프레임의 열 이름을 함수의 인수로 사용하면 된다.

require(plyr)
mean_hp_per_cyl <- function(hp, cyl, ...) mean(hp / cyl)
splat(mean_hp_per_cyl)(mtcars[1,])

[1] 18.33333

splat(mean_hp_per_cyl)(mtcars)

[1] 23.0013

ddply(mtcars, .(round(wt)), function(df) mean_hp_per_cyl(df$hp, df$cyl))

  round(wt)       V1
1         2 20.09375
2         3 21.75321
3         4 26.40625
4         5 27.08333

ddply(mtcars, .(round(wt)), splat(mean_hp_per_cyl))

  round(wt)       V1
1         2 20.09375
2         3 21.75321
3         4 26.40625
4         5 27.08333

우리의 예제는 회귀식을 구하는 문제이므로 다음과 같이 회귀식을 구하는 함수를 정의한다.

getEquation <- function(NTAV,age,...) {
    fit=lm(NTAV~age)
    equation=paste0("y=",round(fit$coef[2],1),"*x + ",round(fit$coef[1],1))
    equation
}
splat(getEquation)(radial)

[1] "y=0.4*x + 44.3"

radial 데이타를 smoking, sex를 기준으로 subdata로 나누고 회귀식을 구하면 다음과 같다.

ddply(radial,.(smoking,sex),splat(getEquation))

     smoking sex             V1
1 ex-smoker    M y=0.3*x + 44.5
2 non-smoker   F   y=0.7*x + 15
3 non-smoker   M y=0.3*x + 59.4
4 smoker       F y=0.7*x + 11.4
5 smoker       M  y=1.2*x + 7.8

ggiraphExtra 패키지의 ggPoints()함수를 쓰면 각 점의 정보 및 회귀식을 포함하는 interactive plot을 얻을 수 있다.

ggPoints(data=radial,aes(x=age,y=NTAV,color=smoking,facet=sex),
         method="lm",se=FALSE,interactive=TRUE)

Procedural Programming vs Functional Programming(I)

문 건 웅

2017년 2월 18일