Regression-based data analysis

Notes compiled from Professor 莊皓鈞's lectures

Regression is a statistical method to assess relationships among variables.

The objective is to find a model for predicting the dependent (response) variable given one or multiple independent (predictor) variables. Being proficient in regression modeling will enable you to answer many questions with data in a formal way and will provide a solid foundation for business data analytics.

Given limited time, I will try to minimize the mathematical foundations of statistical models in the remaining lectures. Without compromising your learning, I will put some technical details aside and focus on applied statistical modeling in R.

5.1 Simple linear regression

For a dataset with n observations, a simple linear regression model is defined by

\[y_i = \beta_0 + \beta_1x_i+\varepsilon_i,\quad i=1,2,...,n\]

where \(\varepsilon_i\) is a random error term with mean zero and variance \(\sigma^2\).

The word simple means that there is one and only one independent variable in the model.

As we collect a data set of sample size \(n\), in most cases we estimate \(\beta_0\) (intercept) and \(\beta_1\) (slope) using ordinary least squares (OLS), which basically solves the optimization problem:

\[\min_{\hat{\beta}_0,\hat{\beta}_1} \sum_{i=1}^n(y_i-\hat{y}_i)^2 \\ \text{s.t. } \hat{y}_i=\hat{\beta}_0+\hat{\beta}_1x_i\]

That is, we minimize the discrepancy between the observed values and the fitted values.

We can rewrite the objective function as: \[\min\ (y_1 - \hat{\beta}_0-\hat{\beta}_1x_1)^2 + (y_2 - \hat{\beta}_0-\hat{\beta}_1x_2)^2+...+(y_n - \hat{\beta}_0-\hat{\beta}_1x_n)^2\]

Using some calculus, we can show that:

\[\hat{\beta_1}=\frac{\sum^n_{i=1}(x_i-\bar{x})(y_i-\bar{y})}{\sum^n_{i=1}(x_i-\bar{x})^2}\]

and \[\hat{\beta}_0=\bar{y}-\hat{\beta}_1\bar{x}\]
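
As a sanity check, these closed-form estimates are easy to compute directly for the built-in cars dataset used below (a minimal sketch):

b1 = sum((cars$speed - mean(cars$speed)) * (cars$dist - mean(cars$dist))) /
     sum((cars$speed - mean(cars$speed))^2)  # slope: cov(x,y)/var(x)
b0 = mean(cars$dist) - b1 * mean(cars$speed) # intercept
c(b0, b1)  # should match coef(lm(dist ~ speed, data=cars))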

data(cars); attach(cars)  # load and attach the built-in cars dataset
plot(cars)


This plot shows the relationship between a car's stopping distance (dist) and its speed. Suppose we estimate the relationship with the model below:

\[dist_i = \beta_0 + \beta_1\,speed_i+\varepsilon_i,\quad i=1,2,...,50\]

L1=lm(dist ~ speed)
summary(L1)

Call:
lm(formula = dist ~ speed)

Residuals:
   Min     1Q Median     3Q    Max 
-29.07  -9.53  -2.27   9.21  43.20 

Coefficients:
            Estimate Std. Error t value        Pr(>|t|)    
(Intercept)  -17.579      6.758   -2.60           0.012 *  
speed          3.932      0.416    9.46 0.0000000000015 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 15.4 on 48 degrees of freedom
Multiple R-squared:  0.651, Adjusted R-squared:  0.644 
F-statistic: 89.6 on 1 and 48 DF,  p-value: 0.00000000000149

From the fitted linear model we obtain

\(\hat{\beta}_0\) = -17.5791, \(\hat{\beta}_1\) = 3.9324

Parameter estimation inherently involves uncertainty; for both \(\beta_0\) and \(\beta_1\) it is quantified by the standard error (SE). The SE has two main applications:

  1. The SE can be used to construct the 95% confidence interval for \(\beta_1\): [\(\hat{\beta}_1\)-1.96・SE(\(\hat{\beta}_1\)), \(\hat{\beta}_1\)+1.96・SE(\(\hat{\beta}_1\))] (see the snippet below)
  2. The SE allows us to perform hypothesis testing
  • H0: \(\beta_1\)=0 (there is no relationship between x and y)
  • Ha: \(\beta_1\)≠0 (there is some relationship between x and y)
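
For the snippet promised in point 1: confint() gives the exact t-based interval, which for n = 50 is close to the 1.96 normal approximation:

confint(L1, level=0.95)  # t-based confidence intervals for both coefficients
# normal approximation for the slope, by hand:
coef(L1)["speed"] + c(-1.96, 1.96) * coef(summary(L1))["speed", "Std. Error"]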

Essentially, we want to determine whether \(\hat{\beta_1}\) is sufficiently far from zero so that we can be confident that \(\beta_1\) is non-zero. In practice, we compute a t-statistic

\[t=\frac{\hat{\beta_1}-0}{SE(\hat{\beta_1})}\]

which measures the number of standard deviations that \(\hat{\beta_1}\) is away from 0.

The p-value is nothing but the probability of observing any value \(>=|t|\), assuming \(\beta_1=0\). A small p-value indicates that the observed association between x and y is probably not due to chance. We reject H0 and declare that a relationship between x and y exists if the p-value is small enough.

t-value = estimate / std error
Meaning of the t-statistic: the numerator is the estimate (your best guess, the value that maximizes fit); the denominator is the standard error (which quantifies uncertainty).

- High uncertainty -> wide interval -> small t-value -> large p-value
- Low uncertainty -> narrow interval -> large t-value -> small p-value -> reject H0
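
As a quick check, the reported p-value can be reproduced from the t-statistic via a two-sided t test on n-2 = 48 degrees of freedom:

tval = coef(summary(L1))["speed", "t value"]
2 * pt(-abs(tval), df = df.residual(L1))  # two-sided p-value, matches summary(L1)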

In addition to the estimated coefficients \(\hat{\beta_0}\) and \(\hat{\beta_1}\), we are interested in the estimated residuals.

\[\hat{\varepsilon}_i = y_i-\hat{y}_i\]

res = residuals(L1)
mean(res)  # essentially zero: OLS residuals average to zero when an intercept is included
[1] 0.00000000000000000919

Residuals allow us to detect potential outliers that we may fail to predict well (DON'T throw outliers away if you have a large sample) and help us assess model specification errors.

yhat = fitted(L1)
plot(yhat, res)

# Do you notice outliers?

lm() is a very powerful function and we will see it many times. Below are functions you can apply to a fitted object such as L1 (Kleiber & Zeileis, 2008: https://drive.google.com/uc?id=1Y1DVlBgEW29Z9IoM8tqr6w-deh56kE7p)
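
A few of the most common extractor functions (all from base R's stats package):

coef(L1)        # estimated coefficients
confint(L1)     # confidence intervals for the coefficients
fitted(L1)      # fitted values
residuals(L1)   # residuals
predict(L1, newdata=data.frame(speed=21))  # prediction for a new observation
vcov(L1)        # variance-covariance matrix of the estimates
anova(L1)       # analysis-of-variance table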

5.2 Multiple linear regression

Multiple linear regression refers to a model with more than one independent variable

\[y_i = \beta_0 + \beta_1x_{1i}+...+\beta_kx_{ki}+\varepsilon_i,\quad i=1,2,...,n\]

Again, what OLS does is solve the optimization problem \[\min \sum_{i=1}^n\big(y_i-\hat{\beta}_0-\hat{\beta}_1x_{1i}-...-\hat{\beta}_kx_{ki}\big)^2\]

The objective function in the optimization problem is called SSE (sum of squared errors). We can modify it into MSE (mean squared errors)

\[MSE = \frac{1}{n}\sum^n_{i=1}(y_i-\hat{y}_i)^2\]

errorMSE=function(par){
  obj=0
  for(i in 1:nrow(cars)){
    # accumulate the squared error of observation i
    obj=(dist[i]-(par[1]+par[2]*speed[i]))^2+obj
  }
  obj  # note: this returns the SSE; dividing by n would not change the minimizer
}

nlminb(c(-5,1), errorMSE)
$par
[1] -17.58   3.93

$objective
[1] 11354

$convergence
[1] 0

$iterations
[1] 6

$evaluations
function gradient 
      10       18 

$message
[1] "relative convergence (4)"
errorMLE=function(par){
  loglik=0
  for(i in 1:nrow(cars)){
    # Gaussian log-likelihood of observation i; reference cars$dist explicitly,
    # since a bare `dist` can resolve to the function stats::dist() and trigger
    # the "object of type 'closure' is not subsettable" error
    loglik=dnorm(cars$dist[i],(par[1]+par[2]*cars$speed[i]),par[3],log=TRUE)+loglik
  }
  -loglik  # negative log-likelihood, to be minimized
}
nlminb(c(-5,1,2), errorMLE, lower=c(-Inf,-Inf,1e-5))

\[E[MSE]=(\text{Model Bias})^2 + \text{Model Variance} + \text{Irreducible Error}\]

The first term is the squared bias that reflects how close the functional form of the model can get to the true relationship between the predictors and the outcome.

The second term is model variance. Here the famous variance-bias trade-off arises. Complex models tend to have high variance and over-fit. Simple models, on the other hand, tend to have high bias and under-fit.

💡 As model complexity grows, variance increases and bias decreases.

As mentioned earlier, we impose assumptions on the error term \(\varepsilon_i\). Essentially we assume \[\varepsilon_i \overset{iid}{\sim} N(0,\sigma^2)\]

A direct consequence of this assumption is that the response variable y must be continuous and iid. This assumption is critical but often ignored by many analysts (including a lot of professors). On some occasions you will have to deal with non-continuous and/or non-iid y observations, which require a different set of techniques.

Let’s analyze the dataset wine.df. In March 1990, Orley Ashenfelter, a Princeton economics professor, claimed he could predict wine quality without tasting the wine. Ashenfelter used a linear regression model:

  • Dependent variable: typical price in 1990-1991 wine auctions (approximates quality). It has been transformed into logarithm form: ln(price).
  • Independent variables: Age, Weather, Average growing season temperature (AGST), Harvest rain, Winter rain, & Population of France in the brewing year.

click to download: wine.csv (https://drive.google.com/uc?id=1YaBjJ9xp_M6JxwLG3HpBjR68wfrr85RV/view?usp=sharing)

detach(cars)
#Use the wine.csv file
wine.df = read.csv("wine.csv")
str(wine.df)
'data.frame':   25 obs. of  7 variables:
 $ Year       : int  1952 1953 1955 1957 1958 1959 1960 1961 1962 1963 ...
 $ Price      : num  7.5 8.04 7.69 6.98 6.78 ...
 $ WinterRain : int  600 690 502 420 582 485 763 830 697 608 ...
 $ AGST       : num  17.1 16.7 17.1 16.1 16.4 ...
 $ HarvestRain: int  160 80 130 110 187 187 290 38 52 155 ...
 $ Age        : int  31 30 28 26 25 24 23 22 21 20 ...
 $ FrancePop  : num  43184 43495 44218 45152 45654 ...
summary(wine.df)
      Year          Price        WinterRain       AGST       HarvestRain 
 Min.   :1952   Min.   :6.20   Min.   :376   Min.   :15.0   Min.   : 38  
 1st Qu.:1960   1st Qu.:6.52   1st Qu.:536   1st Qu.:16.2   1st Qu.: 89  
 Median :1966   Median :7.12   Median :600   Median :16.5   Median :130  
 Mean   :1966   Mean   :7.07   Mean   :605   Mean   :16.5   Mean   :149  
 3rd Qu.:1972   3rd Qu.:7.50   3rd Qu.:697   3rd Qu.:17.1   3rd Qu.:187  
 Max.   :1978   Max.   :8.49   Max.   :830   Max.   :17.6   Max.   :292  
      Age         FrancePop    
 Min.   : 5.0   Min.   :43184  
 1st Qu.:11.0   1st Qu.:46584  
 Median :17.0   Median :50255  
 Mean   :17.2   Mean   :49694  
 3rd Qu.:23.0   3rd Qu.:52894  
 Max.   :31.0   Max.   :54602  
#relationship between variables
library(gpairs)
gpairs(wine.df[2:7])

cor(wine.df[2:7])
             Price WinterRain    AGST HarvestRain    Age FrancePop
Price        1.000    0.13665  0.6596     -0.5633  0.448  -0.46686
WinterRain   0.137    1.00000 -0.3211     -0.2754 -0.017  -0.00162
AGST         0.660   -0.32109  1.0000     -0.0645  0.247  -0.25916
HarvestRain -0.563   -0.27544 -0.0645      1.0000 -0.028   0.04126
Age          0.448   -0.01697  0.2469     -0.0280  1.000  -0.99449
FrancePop   -0.467   -0.00162 -0.2592      0.0413 -0.994   1.00000
library(corrplot)
corrplot.mixed(corr=cor(wine.df[2:7], use="complete.obs"),
                 upper="ellipse", tl.pos="lt")

Consider a multiple linear regression model with two predictors:

m1 = lm(Price ~ AGST + HarvestRain, data=wine.df)
summary(m1)

Call:
lm(formula = Price ~ AGST + HarvestRain, data = wine.df)

Residuals:
    Min      1Q  Median      3Q     Max 
-0.8832 -0.1960  0.0618  0.1538  0.5972 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept) -2.20265    1.85443   -1.19  0.24759    
AGST         0.60262    0.11128    5.42 0.000019 ***
HarvestRain -0.00457    0.00101   -4.52  0.00017 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 0.367 on 22 degrees of freedom
Multiple R-squared:  0.707, Adjusted R-squared:  0.681 
F-statistic: 26.6 on 2 and 22 DF,  p-value: 0.00000135

R-squared: \(R^2 \approx corr(y, \hat{y})^2\); the correlation coefficient itself ranges over [-1, 1].

yhat_m1 = fitted(m1)
(cor(wine.df$Price, yhat_m1))^2
[1] 0.707

This matches the R-squared reported in the lm output; the higher it is, the better the in-sample fit.

F-statistic

H0: \(\beta_1 =\beta_2=...=\beta_k=0\)

When the p-value is large and the F-statistic is small, you cannot even reject this null hypothesis -> the model is no better than the intercept-only model \(y=\beta_0\).

m2 = lm(Price ~ AGST + HarvestRain + WinterRain + Age + FrancePop, data=wine.df)
summary(m2)

Call:
lm(formula = Price ~ AGST + HarvestRain + WinterRain + Age + 
    FrancePop, data = wine.df)

Residuals:
    Min      1Q  Median      3Q     Max 
-0.4818 -0.2466 -0.0073  0.2201  0.5199 

Coefficients:
              Estimate Std. Error t value Pr(>|t|)    
(Intercept) -0.4503989 10.1888839   -0.04  0.96520    
AGST         0.6012239  0.1030203    5.84 0.000013 ***
HarvestRain -0.0039581  0.0008751   -4.52  0.00023 ***
WinterRain   0.0010425  0.0005310    1.96  0.06442 .  
Age          0.0005847  0.0790031    0.01  0.99417    
FrancePop   -0.0000495  0.0001667   -0.30  0.76958    
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 0.302 on 19 degrees of freedom
Multiple R-squared:  0.829, Adjusted R-squared:  0.784 
F-statistic: 18.5 on 5 and 19 DF,  p-value: 0.00000104

Model m2 has 5 predictors -> R-squared is 0.829; in general, the more variables, the larger the R-squared.

m3 = lm(Price ~ AGST + HarvestRain + WinterRain + Age, data=wine.df)
summary(m3)

Call:
lm(formula = Price ~ AGST + HarvestRain + WinterRain + Age, data = wine.df)

Residuals:
    Min      1Q  Median      3Q     Max 
-0.4547 -0.2427  0.0075  0.1977  0.5364 

Coefficients:
             Estimate Std. Error t value  Pr(>|t|)    
(Intercept) -3.429980   1.765898   -1.94   0.06631 .  
AGST         0.607209   0.098702    6.15 0.0000052 ***
HarvestRain -0.003972   0.000854   -4.65   0.00015 ***
WinterRain   0.001076   0.000507    2.12   0.04669 *  
Age          0.023931   0.008097    2.96   0.00782 ** 
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 0.295 on 20 degrees of freedom
Multiple R-squared:  0.829, Adjusted R-squared:  0.794 
F-statistic: 24.2 on 4 and 20 DF,  p-value: 0.000000204

Model m3 has 4 predictors -> Age becomes significant and its coefficient grows, while R-squared stays around 0.83.

What we should really look at is the adjusted R-squared: 0.794 for m3 versus 0.784 for m2, so m3 is the better model. m3 also matches our expectation that wine age is related to price.

💡 Adjusted R-squared penalizes the number of variables.

\[\bar{R}^2=1-(1-R^2)\frac{n-1}{n-p-1}\]

Holding n fixed, as the number of variables p in the model grows, the denominator n-p-1 shrinks, so the penalty term \((1-R^2)\frac{n-1}{n-p-1}\) grows and \(\bar{R}^2\) falls unless \(R^2\) rises enough to compensate.
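
The formula is easy to verify by hand for m3 (n = 25 observations, p = 4 predictors):

n = nrow(wine.df); p = 4
r2 = summary(m3)$r.squared
1 - (1 - r2) * (n - 1) / (n - p - 1)  # matches summary(m3)$adj.r.squared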


Age and FrancePop are strongly negatively correlated: a higher Age means an earlier vintage, when France's population was smaller -> FrancePop is redundant information.

library(jtools)  # plot_summs() comes from the jtools package
plot_summs(m3, scale=TRUE)

Age is in fact a categorical variable. Yet, in prior models, we treat it as a numerical variable. We show how to correctly model a categorical variable (e.g., gender, cities) using dummy variables. The most common way to code dummy variables is to create j-1 dummies for j categories, in which the excluded category serves as the base level (including j variables for j categories makes the regression model NON-identifiable).

We first categorize Age: ageCat=1 if Age <= 10; ageCat=2 if 10 < Age <= 20; ageCat=3 if Age > 20

ageCat=rep(0,length(wine.df$Age))
for (i in c(1:length(wine.df$Age))){
  if (wine.df$Age[i]>20){
    ageCat[i]=3
  } else if (wine.df$Age[i]<=10)
    {
      ageCat[i]=1
  }else{
      ageCat[i]=2
  }
}

wine.new.df=cbind(wine.df,ageCat)
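
As an aside, the same categorization can be written more compactly with cut(); a sketch, where the right-closed intervals reproduce the rule above:

ageCat = as.integer(cut(wine.df$Age, breaks=c(-Inf, 10, 20, Inf)))  # codes 1, 2, 3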
m4 = lm(Price ~ AGST + HarvestRain + WinterRain+as.factor(ageCat), data=wine.new.df)
summary(m4)

Call:
lm(formula = Price ~ AGST + HarvestRain + WinterRain + as.factor(ageCat), 
    data = wine.new.df)

Residuals:
    Min      1Q  Median      3Q     Max 
-0.3712 -0.2486 -0.0156  0.1999  0.4672 

Coefficients:
                    Estimate Std. Error t value  Pr(>|t|)    
(Intercept)        -3.673549   1.874337   -1.96   0.06484 .  
AGST                0.633222   0.103321    6.13 0.0000068 ***
HarvestRain        -0.004007   0.000854   -4.69   0.00016 ***
WinterRain          0.000955   0.000512    1.87   0.07752 .  
as.factor(ageCat)2  0.311788   0.155597    2.00   0.05956 .  
as.factor(ageCat)3  0.497198   0.158081    3.15   0.00533 ** 
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 0.294 on 19 degrees of freedom
Multiple R-squared:  0.839, Adjusted R-squared:  0.796 
F-statistic: 19.7 on 5 and 19 DF,  p-value: 0.000000625

ageCat2 -> 1 if ageCat=2

ageCat3 -> 1 if ageCat=3

How many dummy variables are included? Ans: 2

Which category is the base level? Ans: the ageCat = 1 category

#what as.factor does
age2 = ifelse(wine.new.df$ageCat==2,1,0)
age3 = ifelse(wine.new.df$ageCat==3,1,0)
m5 = lm(Price ~ AGST + HarvestRain + WinterRain+age2+age3, data=wine.new.df)
summary(m5)

Call:
lm(formula = Price ~ AGST + HarvestRain + WinterRain + age2 + 
    age3, data = wine.new.df)

Residuals:
    Min      1Q  Median      3Q     Max 
-0.3712 -0.2486 -0.0156  0.1999  0.4672 

Coefficients:
             Estimate Std. Error t value  Pr(>|t|)    
(Intercept) -3.673549   1.874337   -1.96   0.06484 .  
AGST         0.633222   0.103321    6.13 0.0000068 ***
HarvestRain -0.004007   0.000854   -4.69   0.00016 ***
WinterRain   0.000955   0.000512    1.87   0.07752 .  
age2         0.311788   0.155597    2.00   0.05956 .  
age3         0.497198   0.158081    3.15   0.00533 ** 
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 0.294 on 19 degrees of freedom
Multiple R-squared:  0.839, Adjusted R-squared:  0.796 
F-statistic: 19.7 on 5 and 19 DF,  p-value: 0.000000625

m5 and m4 have identical coefficients.

💡 Regression does not imply any causal relationship!

5.3 Predictive linear regression modeling

Despite its simplicity, linear regression (lm()) is probably the most popular model used to learn associations between the x variables and y. The R² metric reflects how capable the x variables are of explaining variation in y (in-sample variation). However, R² increases as the number of variables increases. A model with higher R² may not be good at out-of-sample prediction.

\[y_i = \beta_0 + \beta_1x_{1i}+...+\beta_kx_{ki}+\varepsilon_i,\quad i=n+1, n+2,...,N\]

Before introducing the use of linear regression for prediction, we must distinguish between in-sample explanation and out-of-sample prediction. The former is about explaining or quantifying the average effect of inputs on an outcome, whereas the latter is about predicting the outcome value for new records, given their input values.

Please keep the distinction between in-sample fit and out-of-sample prediction in mind. Below is an example of predictive modeling through regression. The table below lists the key variables used in the analysis. Our task is to help a dealer predict the price he/she will get for used Toyota Corollas. We use the first 1000 records from the dataset ToyotaCorolla.csv.

Our regression models so far have been evaluated on in-sample performance (in-sample variation); genuine predictive analysis is about out-of-sample performance, on data the model has never seen.

[Figure: Prediction vs. Explanation (https://drive.google.com/uc?id=149Kku08cYvcRoUtKIjKY1CbpG4tGsuFy)]

click to download: ToyotaCorolla.csv (https://drive.google.com/uc?id=15740pezipKNBvI_YhRrapZ9B19QTk3au)

##Analyze the ToyotaCorolla file
car.df = read.csv("ToyotaCorolla.csv")
# use first 1000 rows of data
car.df = car.df[1:1000, ]
# select variables for regression
selected.var = c(3, 4, 7, 8, 9, 10, 12, 13, 14, 17, 18)

A good predictive model could have a looser fit to the data on which it is based, and a good descriptive model can have low prediction accuracy. Striking a balance between over- and under-fitting is a challenging task for prediction. Luckily, we have k-fold cross validation, an extremely useful technique for assessing the predictive power of any model (not limited to regression). It helps us mitigate over-fitting and has become standard practice in predictive data analytics. In 5-fold cross validation, for example, the data are split into 5 folds; each fold in turn serves as the validation set while the remaining 4 folds are used for training.

💡 k-fold CV guides how you select variables and design the model.

Two models – car.lma (all variables) and car.lmb (with an interaction) – are fitted. We use a common trick: adding interaction terms between variables.

\[Price_i = \beta_0+\beta_1 Age_i + \beta_2 Automatic_i + \beta_3Age_i * Automatic_i+...+\varepsilon_i, i=1,2,...,n\]

##cross validation: Toyota example
car.df.new = car.df[,c(3, 4, 7, 8, 9, 10, 12, 13, 14, 17, 18)]
names(car.df.new) #examine data
 [1] "Price"         "Age_08_04"     "KM"            "Fuel_Type"     "HP"           
 [6] "Met_Color"     "Automatic"     "CC"            "Doors"         "Quarterly_Tax"
[11] "Weight"       
table(car.df.new$Automatic) #check variable

  0   1 
952  48 
car.lma = lm(Price~.,data=car.df.new )
summary(car.lma)

Call:
lm(formula = Price ~ ., data = car.df.new)

Residuals:
   Min     1Q Median     3Q    Max 
-10690   -832    -42    796   7040 

Coefficients:
                   Estimate  Std. Error t value             Pr(>|t|)    
(Intercept)     -6419.29475  1398.20235   -4.59    0.000004977252140 ***
Age_08_04        -132.41047     3.76506  -35.17 < 0.0000000000000002 ***
KM                 -0.01865     0.00178  -10.48 < 0.0000000000000002 ***
Fuel_TypeDiesel   654.47182   426.53374    1.53                 0.13    
Fuel_TypePetrol  2536.47473   420.27180    6.04    0.000000002241419 ***
HP                 35.06966     3.92364    8.94 < 0.0000000000000002 ***
Met_Color          58.86125    93.98117    0.63                 0.53    
Automatic         239.57173   208.52260    1.15                 0.25    
CC                 -0.02102     0.09449   -0.22                 0.82    
Doors             -69.01729    48.27076   -1.43                 0.15    
Quarterly_Tax      15.91235     2.00734    7.93    0.000000000000006 ***
Weight             17.40015     1.34783   12.91 < 0.0000000000000002 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 1360 on 988 degrees of freedom
Multiple R-squared:  0.87,  Adjusted R-squared:  0.868 
F-statistic:  599 on 11 and 988 DF,  p-value: <0.0000000000000002
#interaction
car.lmb=lm(Price~ Age_08_04+Automatic+Age_08_04:Automatic+KM+Fuel_Type+
         HP+Met_Color+CC+Doors+Quarterly_Tax+Weight,data=car.df.new)
summary(car.lmb)

Call:
lm(formula = Price ~ Age_08_04 + Automatic + Age_08_04:Automatic + 
    KM + Fuel_Type + HP + Met_Color + CC + Doors + Quarterly_Tax + 
    Weight, data = car.df.new)

Residuals:
   Min     1Q Median     3Q    Max 
-10676   -829    -24    802   7040 

Coefficients:
                       Estimate  Std. Error t value             Pr(>|t|)    
(Intercept)         -6400.65341  1399.45871   -4.57    0.000005401572628 ***
Age_08_04            -132.12799     3.82401  -34.55 < 0.0000000000000002 ***
Automatic             485.45415   611.22843    0.79                 0.43    
KM                     -0.01869     0.00178  -10.49 < 0.0000000000000002 ***
Fuel_TypeDiesel       660.21663   426.92125    1.55                 0.12    
Fuel_TypePetrol      2528.81664   420.82624    6.01    0.000000002620392 ***
HP                     35.15778     3.93066    8.94 < 0.0000000000000002 ***
Met_Color              58.82933    94.02008    0.63                 0.53    
CC                     -0.02908     0.09638   -0.30                 0.76    
Doors                 -68.80555    48.29326   -1.42                 0.15    
Quarterly_Tax          15.88724     2.00902    7.91    0.000000000000007 ***
Weight                 17.38266     1.34900   12.89 < 0.0000000000000002 ***
Age_08_04:Automatic    -5.07461    11.85731   -0.43                 0.67    
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 1360 on 987 degrees of freedom
Multiple R-squared:  0.87,  Adjusted R-squared:  0.868 
F-statistic:  549 on 12 and 987 DF,  p-value: <0.0000000000000002

Interaction terms make the model more complex by accounting for the interplay between variables.

Doing k-fold with a package

library(DAAG)
# cross validation for Linear Regression
cv.lm(data=car.df.new, car.lma, m=2, printit=F)

cv.lm(data=car.df.new, car.lmb, m=2, printit=F)
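
Under the hood, cv.lm does something along these lines; a minimal hand-rolled sketch of 2-fold CV for the full model (fold construction and reporting details differ from DAAG's):

set.seed(1)
k = 2
folds = sample(rep(1:k, length.out = nrow(car.df.new)))  # random fold labels
rmse = numeric(k)
for (j in 1:k) {
  fit = lm(Price ~ ., data = car.df.new[folds != j, ])     # train on the other folds
  pred = predict(fit, newdata = car.df.new[folds == j, ])  # predict the held-out fold
  rmse[j] = sqrt(mean((car.df.new$Price[folds == j] - pred)^2))
}
mean(rmse)  # average out-of-fold RMSE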

💡 On average, model A's price error is smaller than model B's. Two takeaways from this example:

1. You can always build many more models; judging whether a model is good requires out-of-sample evaluation.
2. A stronger, more complex model is not necessarily more accurate; it may overfit.

Doing it by hand (here: a single train/validation split)

set.seed(5566)  # set seed for reproducing the partition
train.index = sample(c(1:1000), 600, replace=FALSE)  
train.df = car.df[train.index, selected.var]
valid.df = car.df[-train.index, selected.var]
# use lm() to run a linear regression of Price on all 11 predictors in the
# training set. 
# use . after ~ to include all the remaining columns in train.df as predictors.
car.lm = lm(Price ~ ., data = train.df)
#  use options() to ensure numbers are not displayed in scientific notation.
options(scipen = 999)
summary(car.lm)

Call:
lm(formula = Price ~ ., data = train.df)

Residuals:
   Min     1Q Median     3Q    Max 
 -6963   -797    -69    720   6375 

Coefficients:
                    Estimate   Std. Error t value             Pr(>|t|)    
(Intercept)     -21456.14372   2260.50997   -9.49 < 0.0000000000000002 ***
Age_08_04         -116.76022      4.78608  -24.40 < 0.0000000000000002 ***
KM                  -0.01547      0.00207   -7.49     0.00000000000026 ***
Fuel_TypeDiesel   -250.67447    506.14087   -0.50               0.6206    
Fuel_TypePetrol   3717.17617    490.49978    7.58     0.00000000000014 ***
HP                  19.26602      4.67796    4.12     0.00004360787847 ***
Met_Color           14.50101    111.87370    0.13               0.8969    
Automatic           74.04368    259.44406    0.29               0.7754    
CC                  -0.06081      0.08806   -0.69               0.4901    
Doors             -183.91791     60.08247   -3.06               0.0023 ** 
Quarterly_Tax       15.62252      2.31551    6.75     0.00000000003623 ***
Weight              31.69292      2.24725   14.10 < 0.0000000000000002 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 1250 on 588 degrees of freedom
Multiple R-squared:  0.889, Adjusted R-squared:  0.887 
F-statistic:  427 on 11 and 588 DF,  p-value: <0.0000000000000002
#evaluate prediction accuracy
library(forecast)
# use predict() to make predictions on a new set. 
car.lm.pred = predict(car.lm, valid.df)
# use accuracy() to compute common accuracy measures.
accuracy(car.lm.pred, valid.df$Price)
           ME RMSE  MAE   MPE MAPE
Test set -107 1696 1086 -1.91 9.64
  • ME (Mean Error): mean of \(y-\hat{y}\); not very useful, since over- and under-predictions cancel out in the average
  • RMSE (Root Mean Squared Error): \(\sqrt{\frac{1}{n}\sum(y-\hat{y})^2}\)
  • MAE (Mean Absolute Error): mean of \(|y-\hat{y}|\)
  • MPE (Mean Percentage Error): mean of \(\frac{y-\hat{y}}{y}\), in %
  • MAPE (Mean Absolute Percentage Error): more useful; mean of the APE \(\frac{|y-\hat{y}|}{y}\), in % (all five are reproduced by hand below)
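
These measures are easy to reproduce by hand from the validation-set errors:

e = valid.df$Price - car.lm.pred
c(ME = mean(e), RMSE = sqrt(mean(e^2)), MAE = mean(abs(e)),
  MPE = 100*mean(e/valid.df$Price), MAPE = 100*mean(abs(e)/valid.df$Price))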

Percentage-based measures break down when the denominator y is 0 -> Kaggle competitions often use the evaluation metric "RMSLE" instead (https://www.kaggle.com/c/ashrae-energy-prediction/discussion/113064).
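
For reference, RMSLE is simply the RMSE computed on log1p-transformed values; a minimal sketch (assuming non-negative predictions):

rmsle = function(actual, pred) sqrt(mean((log1p(pred) - log1p(actual))^2))
rmsle(valid.df$Price, car.lm.pred)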

5.4 Quantile regression

The last piece I want to discuss here is quantile regression, which has gradually become an attractive alternative to the ordinary least squares (OLS) regression.

\[y_i=\beta_0+\beta_1x_{1i}+...+\beta_kx_{ki}+\varepsilon_i,\quad i=1,2,...,n\] The OLS linear regression models the response y as the conditional mean given x

\[E(y_i|x_i)=\beta_0+\beta_1x_{1i}+...+\beta_kx_{ki}\] Suppose \(0<\tau<1\) indicates the proportion of the population with y values below the \(\tau\)th quantile.

The quantile regression instead models the response y as a conditional quantile given x:

\[Q_\tau(y_i|x_i)=\beta_0(\tau)+\beta_1(\tau)x_{1i}+...+\beta_k(\tau)x_{ki}\]

\(\tau\) specifies which quantile (percentile) is being modeled.

💡 Why does quantile regression use absolute values instead of squares?
💡 rolling validation
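
For reference, the \(\tau\)th-quantile fit minimizes the check (pinball) loss, which the hand-coded example at the end of this section implements:

\[\hat{\beta}(\tau)=\arg\min_{\beta}\sum_{i=1}^n \rho_\tau(y_i-x_i^\top\beta), \qquad \rho_\tau(e)=\begin{cases}\tau\,e & e\ge 0\\ (1-\tau)\,|e| & e<0\end{cases}\]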
library(quantreg)
library(AER)
data("CPS1988")
cps_f = log(wage) ~ experience + I(experience^2) + education
cps_lad = rq(cps_f, data=CPS1988)  # least absolute deviations: tau defaults to 0.5
summary(cps_lad)

Call: rq(formula = cps_f, data = CPS1988)

tau: [1] 0.5

Coefficients:
                Value     Std. Error t value   Pr(>|t|) 
(Intercept)       4.24088   0.02190  193.67805   0.00000
experience        0.07744   0.00115   67.50041   0.00000
I(experience^2)  -0.00130   0.00003  -49.97891   0.00000
education         0.09429   0.00140   67.57171   0.00000

The default is \(\tau\) = 0.5, which gives the median regression line: \(4.24+0.077X_1-0.001X_2+0.094X_3\)

Quantile regression is extremely useful when modeling several quantiles of the response y simultaneously. Let's consider the first (\(\tau=0.25\)) and third (\(\tau=0.75\)) quartiles.

If we are interested in the 25th and 75th percentiles, simply pass both quantiles to rq:

cps_rq=rq(cps_f, tau=c(0.25, 0.75), data=CPS1988)
summary(cps_rq)

Call: rq(formula = cps_f, tau = c(0.25, 0.75), data = CPS1988)

tau: [1] 0.25

Coefficients:
                Value     Std. Error t value   Pr(>|t|) 
(Intercept)       3.78227   0.02866  131.95189   0.00000
experience        0.09156   0.00152   60.26474   0.00000
I(experience^2)  -0.00164   0.00004  -45.39065   0.00000
education         0.09321   0.00185   50.32520   0.00000

Call: rq(formula = cps_f, tau = c(0.25, 0.75), data = CPS1988)

tau: [1] 0.75

Coefficients:
                Value     Std. Error t value   Pr(>|t|) 
(Intercept)       4.66005   0.02023  230.39734   0.00000
experience        0.06377   0.00097   65.41364   0.00000
I(experience^2)  -0.00099   0.00002  -44.15591   0.00000
education         0.09434   0.00134   70.65855   0.00000

\(\tau=0.75\): \(4.66+0.063X_1-0.001X_2+0.094X_3\)
\(\tau=0.5\): \(4.24+0.077X_1-0.001X_2+0.094X_3\)
\(\tau=0.25\): \(3.78+0.091X_1-0.002X_2+0.093X_3\)

So for high earners, experience matters less than it does for low earners.

A natural question is whether the regression lines are parallel. That is, are the effects of the regressors uniform across quantiles?

cps_rq25=rq(cps_f, tau=0.25, data=CPS1988)
cps_rq75=rq(cps_f, tau=0.75, data=CPS1988)
anova(cps_rq25, cps_rq75)
Quantile Regression Analysis of Deviance Table

Model: log(wage) ~ experience + I(experience^2) + education
Joint Test of Equality of Slopes: tau in {  0.25 0.75  }

  Df Resid Df F value              Pr(>F)    
1  3    56307     115 <0.0000000000000002 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

The ANOVA tests whether the slopes at the 25th and 75th percentiles differ significantly; according to the results, they do.

What is the key finding here?
We can further visualize the results from quantile regression modeling.

cps_rqbig=rq(cps_f, tau=seq(0.05, 0.95, 0.05), data=CPS1988)
cps_rqbigs=summary(cps_rqbig)
18 non-positive fis
plot(cps_rqbigs)

The x-axis shows different values of \(\tau\) and the y-axis the coefficient estimates. The red line is the ordinary (OLS) regression estimate (a flat line, because the conditional mean gives a single value across all quantiles), and the dashed lines are confidence intervals.

Can you see the drawback of relying merely on the OLS regression (lm())? You should now have confidence in performing regression analysis for cross-sectional data in R.

Hand-coding quantile regression

attach(CPS1988)
tau=0.75
errorMAE = function(par) {
  obj=0 # start from 0
  for(i in 1:nrow(CPS1988)) {
    # ei = difference between the true value and the prediction
    ei=log(wage[i])-(par[1]+par[2]*experience[i]+par[3]*experience[i]^2
                     +par[4]*education[i])
    # check (pinball) loss: tau*ei when ei>0, (1-tau)*|ei| when ei<0
    obj=(tau*ei*(ei>0)+(1-tau)*abs(ei)*(ei<0))+obj
  }
  obj
}
fit=optim(c(0.5,0.5,0.5,0.5),errorMAE)

errorMAE(rep(0.5,4))
[1] 1837907
errorMAE(fit$par)
[1] 4710
coef(cps_rq75)
    (Intercept)      experience I(experience^2)       education 
       4.660054        0.063768       -0.000992        0.094340 
errorMAE(coef(cps_rq75))
(Intercept) 
       4710 

Why use optim? Its built-in search method requires no partial derivatives; differentiation is unfriendly to a function containing absolute values, so it suits our objective function. The resulting estimates essentially match the parameters rq computed at \(\tau=0.75\).

With the same loss value there can be many optimal solutions: throwing lots of variables into the model (raising its dimensionality) may yield many different parameter vectors that attain the same loss.
