1 회귀분석

1.1 변수들의 관계

회귀분석은 변수들 간의 관계를 설명하는 통계적 모형을 생성하는 기법. 결정적 모형 : 오차를 허용하지 않는 모형 통계적 모형 : 변수들 간의 관계에 오차를 허용하는 모형

독립변수(=입력변수)와 종속변수(=결과변수)가 있다. 독립변수는 영향을 주는 변수 종속변수는 영향을 받는 변수

1.2회귀분석의 종류 및 주요개념

  • 단순 회귀분석 : 독립변수가 하나인 모형으로, 간명성이 우수하다
  • 다중 회귀분석 : 독립변수가 여러개
  • 이항 회귀분석 : 종속변수 2개, 대표적으로 로지스틱 이항 회귀분석
  • 다항 회귀분석 : 종속변수가 3개 이상
  • 로지스틱 다항 회귀분석

  • 연속형 범주인 Y가 2개 이상 있다면 동시에 분석할 수 있는 방법은 없다.(적어도 다중회귀 에서는, 구조 방정식에서는 예외)

  • X가 범주형일 때는 X를 0,1,2 등 과 같이 치환해서 분석이 가능하다.

1.3 선형회귀분석

  • 참값과 예측값의 차로 정의되는 오차를 최소화하는 회귀 방정식을 도출
  • 오차는 잔차라고도 한다.
  • SST(Total Sum of Squares)=SSR(Sum of Squared Residuals)+SSE(Sum of Sqaured Errors)
  • 회귀식의 설명력은 결정계수 (R^2)를 사용한다. 결정계수=SSR/SST
  • 결정계수가 높다고 해서 항상 좋은 것은 아니다. (다중공선성 문제 발생 가능)
  • 잔차의 독립성 : 더빈-왓슨 / 다중공선성 - VIF(2.5보다 크면 위험)

1.3.1 데이터준비

autoparts <- read.csv("autoparts.csv", header = T)
dim(autoparts)
## [1] 34139    11

1.3.2 데이터 확인

이상치 여부

is.na(autoparts)                    
sum(is.na(autoparts))

비어있는 값이 있는지 is.na 와는 반대로 TRUE로 출력

complete.cases(autoparts)         
autoparts[!complete.cases(autoparts), ]
head(autoparts)                                  
head(autoparts, n = 10)                         
head(autoparts, n = 30000)

1.3.3 이상치를 제거하기

  1. box-plot
autoparts1 <- autoparts[autoparts$prod_no=="90784-76001", ]     
boxplot(autoparts1$c_thickness)   

dim(autoparts1)
## [1] 21779    11
summary(autoparts1)
##          prod_no         fix_time         a_speed          b_speed     
##  45231-3B400 :    0   Min.   :  1.00   Min.   :0.4570   Min.   :1.240  
##  45231-3B610 :    0   1st Qu.: 81.00   1st Qu.:0.5980   1st Qu.:1.597  
##  45231-3B641 :    0   Median : 82.10   Median :0.6090   Median :1.640  
##  45231-3B660 :    0   Mean   : 83.14   Mean   :0.6189   Mean   :1.644  
##  45231-P3B750:    0   3rd Qu.: 85.40   3rd Qu.:0.6520   3rd Qu.:1.676  
##  90784-76001 :21779   Max.   :148.60   Max.   :0.8080   Max.   :2.528  
##    separation     s_separation     rate_terms         mpa       
##  Min.   :141.6   Min.   :623.3   Min.   :76.00   Min.   :24.80  
##  1st Qu.:185.9   1st Qu.:651.6   1st Qu.:81.00   1st Qu.:75.30  
##  Median :190.7   Median :710.3   Median :85.00   Median :76.60  
##  Mean   :214.5   Mean   :685.9   Mean   :84.53   Mean   :74.21  
##  3rd Qu.:248.7   3rd Qu.:713.6   3rd Qu.:87.00   3rd Qu.:78.10  
##  Max.   :294.5   Max.   :747.3   Max.   :97.00   Max.   :82.10  
##    load_time     highpressure_time   c_thickness     
##  Min.   : 0.00   Min.   :   37.00   Min.   :   0.30  
##  1st Qu.:18.10   1st Qu.:   60.00   1st Qu.:  21.80  
##  Median :19.20   Median :   67.00   Median :  23.80  
##  Mean   :18.68   Mean   :   96.36   Mean   :  27.44  
##  3rd Qu.:19.20   3rd Qu.:   72.00   3rd Qu.:  25.40  
##  Max.   :22.30   Max.   :65534.00   Max.   :6553.40
boxplot(autoparts1)

  1. 기초 통계량 확인
  • out-lier를 확인하기위해 꼭 해야함
autoparts2 <- autoparts1[autoparts1$c_thickness < 1000, ]            # 이상치 제거 
dim(autoparts2)
## [1] 21767    11
boxplot(autoparts2$c_thickness)

  1. Histogram
hist(autoparts2$c_thickness)

hist(autoparts2$c_thickness,breaks = 50)         # breaks 옵션을 통해 구간 수를 직접 설정할 수도 있다.

  1. corrplot 패키지를 이용한 변수들간의 상관관계 확인
# corrplot 패키지 불러오기 
library("corrplot")
## corrplot 0.84 loaded
autopartcor <- autoparts2[c(-1,-7,-10)]

# corrplot 그리기 
cor=cor(autopartcor)
corrplot(cor,method = "shade", addCoef.col = "black")

1.3.4 모형 만들기

이미 전처리가 되어있는 내장된 women data 사용

women
##    height weight
## 1      58    115
## 2      59    117
## 3      60    120
## 4      61    123
## 5      62    126
## 6      63    129
## 7      64    132
## 8      65    135
## 9      66    139
## 10     67    142
## 11     68    146
## 12     69    150
## 13     70    154
## 14     71    159
## 15     72    164
summary(women)
##      height         weight     
##  Min.   :58.0   Min.   :115.0  
##  1st Qu.:61.5   1st Qu.:124.5  
##  Median :65.0   Median :135.0  
##  Mean   :65.0   Mean   :136.7  
##  3rd Qu.:68.5   3rd Qu.:148.0  
##  Max.   :72.0   Max.   :164.0
m <- lm(weight ~ height, data = women)   
m
## 
## Call:
## lm(formula = weight ~ height, data = women)
## 
## Coefficients:
## (Intercept)       height  
##      -87.52         3.45
summary(m)                                      
## 
## Call:
## lm(formula = weight ~ height, data = women)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -1.7333 -1.1333 -0.3833  0.7417  3.1167 
## 
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)    
## (Intercept) -87.51667    5.93694  -14.74 1.71e-09 ***
## height        3.45000    0.09114   37.85 1.09e-14 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 1.525 on 13 degrees of freedom
## Multiple R-squared:  0.991,  Adjusted R-squared:  0.9903 
## F-statistic:  1433 on 1 and 13 DF,  p-value: 1.091e-14
  • 회귀 계수와 intercept 의 p-value가 매우 작기 때문에 유의하다. p-value: 1.091e-14 값은 모델 전체에 대한 유의확률

1.3.5 모델을 이용한 데이터 예측

new.data <- data.frame(height=75)               # 예측할 새로운 데이터명 지정
predict(m, newdata = new.data)                  # predict 함수를 이용해 weight를 예측
##        1 
## 171.2333
new.data <- data.frame(height=c(75,76))
predict(m, newdata = new.data)
##        1        2 
## 171.2333 174.6833

1.3.6 신뢰구간

# 구간 추정을 하는 함수 interval = "confidence" 
new.data <- data.frame(height=c(75,76))
predict(m, newdata = new.data, interval = "confidence")  
##        fit      lwr      upr
## 1 171.2333 169.0885 173.3781
## 2 174.6833 172.3565 177.0102
m <- lm(c_thickness ~ fix_time, data = autoparts2)   # fix_time이 c_thickness에 미치는 영향을 보기 위해 단순회귀 모형
new.data <- data.frame(fix_time = 86.1)   
predict(m,newdata = new.data)
##        1 
## 23.62006

1.4 다중 선형 회귀분석

 autoparts <- read.csv("autoparts.csv", header = TRUE)
 autoparts1 <- autoparts[autoparts$prod_no=="90784-76001", c(2:11)] 
 autoparts2 <- autoparts1[autoparts1$c_thickness < 1000, ]
 head(autoparts2)
##   fix_time a_speed b_speed separation s_separation rate_terms  mpa
## 1     85.5   0.611   1.715      242.0        657.6         95 78.2
## 2     86.2   0.606   1.708      244.7        657.1         95 77.9
## 3     86.0   0.609   1.715      242.7        657.5         95 78.0
## 4     86.1   0.610   1.718      241.9        657.3         95 78.2
## 5     86.1   0.603   1.704      242.5        657.3         95 77.9
## 6     86.3   0.606   1.707      244.5        656.9         95 77.9
##   load_time highpressure_time c_thickness
## 1      18.1                58        24.7
## 2      18.2                58        22.5
## 3      18.1                82        24.1
## 4      18.1                74        25.1
## 5      18.2                56        24.5
## 6      18.0                78        22.9
 head(autoparts2)
##   fix_time a_speed b_speed separation s_separation rate_terms  mpa
## 1     85.5   0.611   1.715      242.0        657.6         95 78.2
## 2     86.2   0.606   1.708      244.7        657.1         95 77.9
## 3     86.0   0.609   1.715      242.7        657.5         95 78.0
## 4     86.1   0.610   1.718      241.9        657.3         95 78.2
## 5     86.1   0.603   1.704      242.5        657.3         95 77.9
## 6     86.3   0.606   1.707      244.5        656.9         95 77.9
##   load_time highpressure_time c_thickness
## 1      18.1                58        24.7
## 2      18.2                58        22.5
## 3      18.1                82        24.1
## 4      18.1                74        25.1
## 5      18.2                56        24.5
## 6      18.0                78        22.9
 summary(autoparts2)
##     fix_time         a_speed          b_speed        separation   
##  Min.   :  1.00   Min.   :0.4570   Min.   :1.240   Min.   :141.6  
##  1st Qu.: 81.00   1st Qu.:0.5980   1st Qu.:1.597   1st Qu.:185.9  
##  Median : 82.10   Median :0.6090   Median :1.640   Median :190.7  
##  Mean   : 83.14   Mean   :0.6189   Mean   :1.644   Mean   :214.5  
##  3rd Qu.: 85.40   3rd Qu.:0.6520   3rd Qu.:1.676   3rd Qu.:248.7  
##  Max.   :148.60   Max.   :0.8080   Max.   :2.528   Max.   :294.5  
##   s_separation     rate_terms         mpa          load_time    
##  Min.   :623.3   Min.   :76.00   Min.   :24.80   Min.   : 0.00  
##  1st Qu.:651.6   1st Qu.:81.00   1st Qu.:75.30   1st Qu.:18.10  
##  Median :710.3   Median :85.00   Median :76.60   Median :19.20  
##  Mean   :685.9   Mean   :84.54   Mean   :74.21   Mean   :18.68  
##  3rd Qu.:713.6   3rd Qu.:87.00   3rd Qu.:78.10   3rd Qu.:19.20  
##  Max.   :747.3   Max.   :97.00   Max.   :82.10   Max.   :22.30  
##  highpressure_time   c_thickness   
##  Min.   :   37.00   Min.   : 0.30  
##  1st Qu.:   60.00   1st Qu.:21.80  
##  Median :   67.00   Median :23.80  
##  Mean   :   96.38   Mean   :23.84  
##  3rd Qu.:   72.00   3rd Qu.:25.40  
##  Max.   :65534.00   Max.   :65.60
 m <- lm(c_thickness ~ fix_time + a_speed, data = autoparts2)  # c_thickness에 두 변수가 어떤 영향을 미치는지 알아보겠다.
 m2 <- lm(c_thickness ~ ., data = autoparts2)                  # c_thickness를 제외한 모든 변수를 다 넣어 보겠다.

1.4.1 잔차분석

par(mfrow=c(2,2))
plot(m)

1번 : 두 영역의 모습이 서로 달라 등분산을 이루지 않음

  • 잔차는 등분산성을 만족하여야 한다.
  • 잔차에 패턴이 있다면 모델 수정이 필요하다.

2번 : Q-Q plot

  • 직선일 때 잔차의 정규성 만족

3번 : 표준화 잔차

  • 기울기가 0인 직선이 이상적

4번 : cook’s distance

  • 왼쪽 가운데 몰려있어야 이상적

1.4.2 예측

#1) 새로운 1개의 데이터로 1개의 종속변수를 예측해 본다
# 데이터의 여러 독립변수를 데이터 프레임으로 구성
new.data <- data.frame(fix_time = 86.1, a_speed = 0.610, b_speed = 1.718,
                       separation = 241.9, s_separation = 657.3, rate_terms = 95, mpa = 78.2,
                       load_time = 18.1, highpressuer_time = 74)
predict(m, newdata = new.data)       # 제품의 두께(C_thickness)출력  
##        1 
## 23.44666
#2) 새로운 복수의 데이터로 복수의 종속변수를 예측해 본다
#데이터의 여러 독립변수를 벡터로 구성한 뒤, 데이터프레임으로 구성
new.data <- data.frame(fix_time=c(86.1,86.1), a_speed = c(0.610, 0.603),
                       b_speed = c(1.718, 1.704), separation = c(241.9, 242.5), 
                       s_separation = c(657.3, 657.3), rate_terms = c(95, 95), 
                       mpa = c(78.2, 77.9),
                       load_time = c(18.1, 18.2), highpressure_time = c(74, 56))
predict(m, newdata = new.data)
##        1        2 
## 23.44666 23.57274
#3) 단순선형 회귀모형과 마찬가지로 신뢰구간을 구할 수 있다.
predict(m, newdata = new.data, interval = "confidence")
##        fit      lwr      upr
## 1 23.44666 23.37760 23.51571
## 2 23.57274 23.50448 23.64101

1.4.3 최적 변수 찾기

  • 모형의 선택
head(swiss)
##              Fertility Agriculture Examination Education Catholic
## Courtelary        80.2        17.0          15        12     9.96
## Delemont          83.1        45.1           6         9    84.84
## Franches-Mnt      92.5        39.7           5         5    93.40
## Moutier           85.8        36.5          12         7    33.77
## Neuveville        76.9        43.5          17        15     5.16
## Porrentruy        76.1        35.3           9         7    90.57
##              Infant.Mortality
## Courtelary               22.2
## Delemont                 22.2
## Franches-Mnt             20.2
## Moutier                  20.3
## Neuveville               20.6
## Porrentruy               26.6
dim(swiss)
## [1] 47  6
m <- lm(Fertility ~ ., data = swiss)
summary(m)
## 
## Call:
## lm(formula = Fertility ~ ., data = swiss)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -15.2743  -5.2617   0.5032   4.1198  15.3213 
## 
## Coefficients:
##                  Estimate Std. Error t value Pr(>|t|)    
## (Intercept)      66.91518   10.70604   6.250 1.91e-07 ***
## Agriculture      -0.17211    0.07030  -2.448  0.01873 *  
## Examination      -0.25801    0.25388  -1.016  0.31546    
## Education        -0.87094    0.18303  -4.758 2.43e-05 ***
## Catholic          0.10412    0.03526   2.953  0.00519 ** 
## Infant.Mortality  1.07705    0.38172   2.822  0.00734 ** 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 7.165 on 41 degrees of freedom
## Multiple R-squared:  0.7067, Adjusted R-squared:  0.671 
## F-statistic: 19.76 on 5 and 41 DF,  p-value: 5.594e-10
  1. 전진선택법 -AIC가 낮을 수록 개선된 모델
step(m, direction = "forward")                     # 전진선택법 선택
## Start:  AIC=190.69
## Fertility ~ Agriculture + Examination + Education + Catholic + 
##     Infant.Mortality
## 
## Call:
## lm(formula = Fertility ~ Agriculture + Examination + Education + 
##     Catholic + Infant.Mortality, data = swiss)
## 
## Coefficients:
##      (Intercept)       Agriculture       Examination         Education  
##          66.9152           -0.1721           -0.2580           -0.8709  
##         Catholic  Infant.Mortality  
##           0.1041            1.0770
m <- lm(Fertility ~ Agriculture, data = swiss)     # 하나의 변수만 선택
step(m, direction = "forward") 
## Start:  AIC=234.09
## Fertility ~ Agriculture
## 
## Call:
## lm(formula = Fertility ~ Agriculture, data = swiss)
## 
## Coefficients:
## (Intercept)  Agriculture  
##     60.3044       0.1942
  1. 후진제거법
step(m, direction = "backward")                    # 후진제거법 선택
## Start:  AIC=234.09
## Fertility ~ Agriculture
## 
##               Df Sum of Sq    RSS    AIC
## <none>                     6283.1 234.09
## - Agriculture  1    894.84 7178.0 238.34
## 
## Call:
## lm(formula = Fertility ~ Agriculture, data = swiss)
## 
## Coefficients:
## (Intercept)  Agriculture  
##     60.3044       0.1942
m <- lm(Fertility ~ Agriculture + Education + Catholic + Infant.Mortality, data = swiss)     # 하나의 변수만 선택
step(m, direction = "backward") 
## Start:  AIC=189.86
## Fertility ~ Agriculture + Education + Catholic + Infant.Mortality
## 
##                    Df Sum of Sq    RSS    AIC
## <none>                          2158.1 189.86
## - Agriculture       1    264.18 2422.2 193.29
## - Infant.Mortality  1    409.81 2567.9 196.03
## - Catholic          1    956.57 3114.6 205.10
## - Education         1   2249.97 4408.0 221.43
## 
## Call:
## lm(formula = Fertility ~ Agriculture + Education + Catholic + 
##     Infant.Mortality, data = swiss)
## 
## Coefficients:
##      (Intercept)       Agriculture         Education          Catholic  
##          62.1013           -0.1546           -0.9803            0.1247  
## Infant.Mortality  
##           1.0784
  1. both
step(m, direction = "both")                        
## Start:  AIC=189.86
## Fertility ~ Agriculture + Education + Catholic + Infant.Mortality
## 
##                    Df Sum of Sq    RSS    AIC
## <none>                          2158.1 189.86
## - Agriculture       1    264.18 2422.2 193.29
## - Infant.Mortality  1    409.81 2567.9 196.03
## - Catholic          1    956.57 3114.6 205.10
## - Education         1   2249.97 4408.0 221.43
## 
## Call:
## lm(formula = Fertility ~ Agriculture + Education + Catholic + 
##     Infant.Mortality, data = swiss)
## 
## Coefficients:
##      (Intercept)       Agriculture         Education          Catholic  
##          62.1013           -0.1546           -0.9803            0.1247  
## Infant.Mortality  
##           1.0784

연습문제

  • 1
autoparts <- read.csv("autoparts.csv", header = TRUE)
autoparts1 <- autoparts[autoparts$prod_no=="90784-76001", c(2:11)] 
autoparts2 <- autoparts1[autoparts1$c_thickness < 1000, ]

m <- lm(c_thickness ~., data = autoparts2)
step(m, direction = "forward")
## Start:  AIC=25500.35
## c_thickness ~ fix_time + a_speed + b_speed + separation + s_separation + 
##     rate_terms + mpa + load_time + highpressure_time
## 
## Call:
## lm(formula = c_thickness ~ fix_time + a_speed + b_speed + separation + 
##     s_separation + rate_terms + mpa + load_time + highpressure_time, 
##     data = autoparts2)
## 
## Coefficients:
##       (Intercept)           fix_time            a_speed  
##         7.146e+02          6.010e-02         -1.738e+01  
##           b_speed         separation       s_separation  
##         1.952e+00         -7.592e-01         -7.468e-01  
##        rate_terms                mpa          load_time  
##         1.133e-02         -1.520e-01         -1.523e-01  
## highpressure_time  
##        -2.174e-05
step(m, direction = "backward")
## Start:  AIC=25500.35
## c_thickness ~ fix_time + a_speed + b_speed + separation + s_separation + 
##     rate_terms + mpa + load_time + highpressure_time
## 
##                     Df Sum of Sq    RSS   AIC
## <none>                            70175 25500
## - highpressure_time  1        20  70195 25505
## - rate_terms         1        32  70207 25508
## - fix_time           1       410  70585 25625
## - b_speed            1       535  70710 25664
## - load_time          1      1065  71240 25826
## - a_speed            1      5462  75637 27130
## - mpa                1     35055 105230 34318
## - s_separation       1    133330 203505 48674
## - separation         1    140717 210892 49450
## 
## Call:
## lm(formula = c_thickness ~ fix_time + a_speed + b_speed + separation + 
##     s_separation + rate_terms + mpa + load_time + highpressure_time, 
##     data = autoparts2)
## 
## Coefficients:
##       (Intercept)           fix_time            a_speed  
##         7.146e+02          6.010e-02         -1.738e+01  
##           b_speed         separation       s_separation  
##         1.952e+00         -7.592e-01         -7.468e-01  
##        rate_terms                mpa          load_time  
##         1.133e-02         -1.520e-01         -1.523e-01  
## highpressure_time  
##        -2.174e-05
step(m, direction = "both")
## Start:  AIC=25500.35
## c_thickness ~ fix_time + a_speed + b_speed + separation + s_separation + 
##     rate_terms + mpa + load_time + highpressure_time
## 
##                     Df Sum of Sq    RSS   AIC
## <none>                            70175 25500
## - highpressure_time  1        20  70195 25505
## - rate_terms         1        32  70207 25508
## - fix_time           1       410  70585 25625
## - b_speed            1       535  70710 25664
## - load_time          1      1065  71240 25826
## - a_speed            1      5462  75637 27130
## - mpa                1     35055 105230 34318
## - s_separation       1    133330 203505 48674
## - separation         1    140717 210892 49450
## 
## Call:
## lm(formula = c_thickness ~ fix_time + a_speed + b_speed + separation + 
##     s_separation + rate_terms + mpa + load_time + highpressure_time, 
##     data = autoparts2)
## 
## Coefficients:
##       (Intercept)           fix_time            a_speed  
##         7.146e+02          6.010e-02         -1.738e+01  
##           b_speed         separation       s_separation  
##         1.952e+00         -7.592e-01         -7.468e-01  
##        rate_terms                mpa          load_time  
##         1.133e-02         -1.520e-01         -1.523e-01  
## highpressure_time  
##        -2.174e-05
  • 2
A <- data.frame(name = c("A","B","C","D","E","F","G","H","I","J"),
                grade = c(90,75,77,83,65,80,83,70,87,79), 
                IQ = c(140, 125, 120, 135, 105, 123, 132, 115, 128, 131),
                school = c(2,1,1,2,0,3,3,1,4,2),
                game = c(1,3,0,3,4,1,4,1,0,2),
                TV = c(0,3,4,2,4,1,1,3,0,3))
A
##    name grade  IQ school game TV
## 1     A    90 140      2    1  0
## 2     B    75 125      1    3  3
## 3     C    77 120      1    0  4
## 4     D    83 135      2    3  2
## 5     E    65 105      0    4  4
## 6     F    80 123      3    1  1
## 7     G    83 132      3    4  1
## 8     H    70 115      1    1  3
## 9     I    87 128      4    0  0
## 10    J    79 131      2    2  3
lm <- lm(grade ~ ., data = A)
lm
## 
## Call:
## lm(formula = grade ~ ., data = A)
## 
## Coefficients:
## (Intercept)        nameB        nameC        nameD        nameE  
##          90          -15          -13           -7          -25  
##       nameF        nameG        nameH        nameI        nameJ  
##         -10           -7          -20           -3          -11  
##          IQ       school         game           TV  
##          NA           NA           NA           NA
predict(lm, new.data = data.frame(x=5))
##  1  2  3  4  5  6  7  8  9 10 
## 90 75 77 83 65 80 83 70 87 79
lm3 <- lm(grade ~ IQ + school + game + TV, data = A)
    1. 꽃받침의 길이를 예측
x <- iris
head(iris)
##   Sepal.Length Sepal.Width Petal.Length Petal.Width Species
## 1          5.1         3.5          1.4         0.2  setosa
## 2          4.9         3.0          1.4         0.2  setosa
## 3          4.7         3.2          1.3         0.2  setosa
## 4          4.6         3.1          1.5         0.2  setosa
## 5          5.0         3.6          1.4         0.2  setosa
## 6          5.4         3.9          1.7         0.4  setosa
sepalL <- x$Sepal.Length
sepalW <- x$Sepal.Width
petalL <- x$Petal.Length
petalW <- x$Petal.Width

m <- lm(formula = sepalL~sepalW + petalL + petalW, data=x)
m
## 
## Call:
## lm(formula = sepalL ~ sepalW + petalL + petalW, data = x)
## 
## Coefficients:
## (Intercept)       sepalW       petalL       petalW  
##      1.8560       0.6508       0.7091      -0.5565