Packages and setup

We will be using these packages :

# Importing and wrangling data
library(readr)
library(tidyr)
library(dplyr)

# EDA
library(skimr)

# For plotting
library(ggplot2)

# Modelling
library(ggfortify)
library(rstatix)
library(ggpubr)
library(broom)

And we will be using my usual custom ggplot theme :

source("https://raw.githubusercontent.com/davidcarayon/TidyTuesdaySubmissions/master/R/themes.R")

# And adding some IKEA specificities
ikea_yellow <- "#F7E700"
ikea_blue <- "#273FAE"

theme_update(panel.background = element_rect(fill = ikea_blue))

So this week’s data is about IKEA furniture. First, we load the data. As there is only one dataframe this week, we’ll read the data manually :

ikea <- readr::read_csv('https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2020/2020-11-03/ikea.csv') %>% 
  mutate(item_id = as.character(item_id)) # Don't want the ID to be considered as a numeric variable. 

Exploratory Data Analysis

Data structure

Let’s have a look at the data structure :

glimpse(ikea)
## Rows: 3,694
## Columns: 14
## $ X1                <dbl> 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 1…
## $ item_id           <chr> "90420332", "368814", "9333523", "80155205", "30180…
## $ name              <chr> "FREKVENS", "NORDVIKEN", "NORDVIKEN / NORDVIKEN", "…
## $ category          <chr> "Bar furniture", "Bar furniture", "Bar furniture", …
## $ price             <dbl> 265, 995, 2095, 69, 225, 345, 129, 195, 129, 2176, …
## $ old_price         <chr> "No old price", "No old price", "No old price", "No…
## $ sellable_online   <lgl> TRUE, FALSE, FALSE, TRUE, TRUE, TRUE, TRUE, TRUE, T…
## $ link              <chr> "https://www.ikea.com/sa/en/p/frekvens-bar-table-in…
## $ other_colors      <chr> "No", "No", "No", "Yes", "No", "No", "No", "No", "N…
## $ short_description <chr> "Bar table, in/outdoor,          51x51 cm", "Bar ta…
## $ designer          <chr> "Nicholai Wiig Hansen", "Francis Cayouette", "Franc…
## $ depth             <dbl> NA, NA, NA, 50, 60, 45, 44, 50, 44, NA, 44, 45, 47,…
## $ height            <dbl> 99, 105, NA, 100, 43, 91, 95, NA, 95, NA, 103, 102,…
## $ width             <dbl> 51, 80, NA, 60, 74, 40, 50, 50, 50, NA, 52, 40, 46,…

We can now have a deeper look into data structure. I especially like the skimr::skim() function for this :

skim(ikea)
Data summary
Name ikea
Number of rows 3694
Number of columns 14
_______________________
Column type frequency:
character 8
logical 1
numeric 5
________________________
Group variables None

Variable type: character

skim_variable n_missing complete_rate min max empty n_unique whitespace
item_id 0 1 5 8 0 2962 0
name 0 1 3 27 0 607 0
category 0 1 4 36 0 17 0
old_price 0 1 4 13 0 365 0
link 0 1 52 163 0 2962 0
other_colors 0 1 2 3 0 2 0
short_description 0 1 3 63 0 1706 0
designer 0 1 3 1261 0 381 0

Variable type: logical

skim_variable n_missing complete_rate mean count
sellable_online 0 1 0.99 TRU: 3666, FAL: 28

Variable type: numeric

skim_variable n_missing complete_rate mean sd p0 p25 p50 p75 p100 hist
X1 0 1.00 1846.50 1066.51 0 923.25 1846.5 2769.75 3693 ▇▇▇▇▇
price 0 1.00 1078.21 1374.65 3 180.90 544.7 1429.50 9585 ▇▁▁▁▁
depth 1463 0.60 54.38 29.96 1 38.00 47.0 60.00 257 ▇▃▁▁▁
height 988 0.73 101.68 61.10 1 67.00 83.0 124.00 700 ▇▂▁▁▁
width 589 0.84 104.47 71.13 1 60.00 80.0 140.00 420 ▇▅▂▁▁

It seems that each piece of furniture has a single ID and a name. These furniture are described by a qualitative variable, category, and a few quantitative variables such as price, depth, height or width. We already can see some NA’s that will have to be taken into account for data analysis.

We can also note that the total number of rows (3694) is larger than the number of unique furniture id’s (2962). We can try to explore why :

# Let's find the duplicates
duplicates <- ikea %>% 
  group_by(item_id) %>% 
  count() %>% 
  filter(n>1) %>% 
  pull(item_id)

# Filter by duplicates
ikea %>% filter(item_id %in% duplicates) %>% 
  arrange(item_id) %>% 
  select(X1:price) %>% 
  head()
X1 item_id name category price
1769 10091453 TROFAST Children’s furniture 275
1897 10091453 TROFAST Nursery furniture 275
5 10122647 INGOLF Bar furniture 345
1214 10122647 INGOLF Chairs 345
1757 10201673 SUNDVIK Children’s furniture 175
2801 10201673 SUNDVIK Tables & desks 175

The mismatch between item ID and the number of rows is due to multiple categories being assigned to the same item, creating multiple lines for the same object. This could eventually be fixed by aggregating these lines with a paste(collapse = ", ") but we will keep the original structure for now.

Some charts

Speaking about categories, let’s see which categories are the most sold by IKEA :

ikea %>% 
  group_by(category) %>% 
  count() %>% 

ggplot(aes(x = reorder(category,n), y = n)) +
  geom_segment(aes(xend = reorder(category,n), y = 0, yend = n), color = ikea_yellow, size = 2) +
  geom_point(shape = 21, size = 4, fill = ikea_yellow, color = "black") +
  scale_y_continuous(breaks = seq(0,600,50)) +
  coord_flip() +
  labs(y = "# of items", x = "Categories")

Then we can ask ourselves : Which categories are, on average, the most expensive ones ?

# We convert the prices in euros € using an approximate rate of 1SAR ~ 0.22EUR

ikea <- ikea %>% 
  mutate(euro_price = price * 0.22)


mean_prices <- ikea %>%
  group_by(category) %>% 
  mutate(mean_price = mean(euro_price, na.rm=TRUE),
         median_price = median(euro_price,na.rm=TRUE)) %>%
  ungroup()

ggplot(mean_prices,aes(x = reorder(category,median_price), y = euro_price)) +
  stat_boxplot(geom = "errorbar", width = 0.3, color = ikea_yellow) +
  geom_boxplot(fill = ikea_yellow, color = "black", outlier.color = ikea_yellow) +
  coord_flip() +
  scale_y_continuous(labels = scales::dollar_format(suffix = "€", prefix = "")) +
  labs(x = "Categories", y = "Price in euros €")

So, wardrobes, sofas and beds are usually the most expensive pieces of furniture sold by IKEA. But we can clearly see here that there are a lot of outliers !

Linear model

To practice our linear model skills, we’ll ask ourselves a very simple question : Do Sofas price depends on their width ?

Let’s have a first look to this relation with a simple scatterplot :

ikea_sofas <- ikea %>% filter(category == "Sofas & armchairs") 

ggplot(ikea_sofas,aes(x = width, y = euro_price)) +
  geom_point(color = ikea_yellow) +
  scale_y_continuous(labels = scales::dollar_format(suffix = "€", prefix = "")) +
  labs(x = "Sofas width (cm)", y ="Price (€)", title = paste0("IKEA Sofas and armchairs, N = ",nrow(ikea_sofas)," items"))

Checking for the LINE conditions

Our linear model proposition seems appropriate. But first, we need to check the LINE conditions :

  • L : Linear
  • I : Independent
  • N : Normality
  • E : Equality of variances

The linear model we are using is actually an Ordinary Least Squares (OLS) regression, which means that the algorithm used to find the “best fitting line” is the one that minimizes the square of the distance between each point and the line (i.e. the residual values) :

# First run the model
model <- lm(euro_price ~ width, data = ikea_sofas)

augment(model) %>% 
  ggplot(aes(x = width, y = euro_price)) +
  geom_segment(aes(x = width, y = euro_price, yend = .fitted, xend = width), color = "red") +
  geom_point(fill = ikea_yellow, color = "black",shape = 21) +
  geom_line(aes(y = .fitted), color = ikea_yellow) +
  scale_y_continuous(labels = scales::dollar_format(suffix = "€", prefix = "")) +
  labs(x = "Sofas width (cm)", y = "Price (€)")

We can now check our conditions using the autoplot.lm method from the {ggfortify} package :

autoplot(model) + theme_bw()

Having a look at these graphs, we can see that the normality condition seems to be respected (the residuals are globally fitting the QQ-line). The equality of variances also seems respected when considering the residuals vs fitted graph (except a minor “banana” form). We can also see that the 365th data point seems to be an extreme/outlier value. For future analysis, we decide to remove this particular point :

ikea_sofas <- ikea_sofas[-365,]

We can consider that our model doesn’t violate the LINE conditions.

Basic model

We can fit the model using the usual lm function :

model <- lm(euro_price ~ width, data = ikea_sofas)

summary(model)
## 
## Call:
## lm(formula = euro_price ~ width, data = ikea_sofas)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -555.76 -180.02   -6.16  140.54  764.24 
## 
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)    
## (Intercept) -208.2941    34.0872  -6.111 3.45e-09 ***
## width          4.4375     0.1642  27.026  < 2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 247.6 on 270 degrees of freedom
##   (155 observations deleted due to missingness)
## Multiple R-squared:  0.7301, Adjusted R-squared:  0.7291 
## F-statistic: 730.4 on 1 and 270 DF,  p-value: < 2.2e-16

What we can say about the model output :

  • The intercept values says that a sofa of 0 cm would cost -208.29€, which has no real meaning here.
  • For each additional cm of width , the price increases on average by 4.44€
  • The relation between these 2 variables is statistically significant (P-value < 0.05)
  • The model explains about 73% of the variability
  • The mean error is however quite high, being of 247.6€.

With this model, we could try to predict the price of a new IKEA sofa, based on its width.

For example, we can try this with a new sofa with a width of 123 cm :

# We extract a single line from the dataset
new_data <- data.frame(width = 123)

new_data$predicted_price <- predict(model,newdata = new_data)

new_data
width predicted_price
123 337.5206
ggplot(ikea_sofas,aes(x = width, y = euro_price)) +
  geom_point(color = ikea_yellow) +
  geom_point(data = new_data, aes(x = width, y = predicted_price), size = 4, color = "red") +
  scale_y_continuous(labels = scales::dollar_format(suffix = "€", prefix = "")) +
  labs(x = "Sofas width (cm)", y ="Price (€)")  +
  theme(panel.background = element_rect(fill = ikea_blue))

But maybe we could build a more complex model, also including height and depth.

Multiple regression

model <- lm(euro_price ~ width + height + depth, data = ikea_sofas)
autoplot(model) + theme_bw()

The LINE conditions are fulfilled. We can now explore the model summary.

summary(model)
## 
## Call:
## lm(formula = euro_price ~ width + height + depth, data = ikea_sofas)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -666.97 -185.30    3.42  128.86  796.39 
## 
## Coefficients:
##               Estimate Std. Error t value Pr(>|t|)    
## (Intercept) -352.98551   72.37197  -4.877 2.07e-06 ***
## width          3.83633    0.24153  15.883  < 2e-16 ***
## height        -0.07756    0.99764  -0.078    0.938    
## depth          2.53334    0.50990   4.968 1.36e-06 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 242.3 on 218 degrees of freedom
##   (205 observations deleted due to missingness)
## Multiple R-squared:  0.7281, Adjusted R-squared:  0.7243 
## F-statistic: 194.6 on 3 and 218 DF,  p-value: < 2.2e-16

What we can say :

  • Each slope coefficient is a partial regression coefficient. For example, we can say that, on average, each cm of depth will raise the price by 2.5€ after controlling for width and height.
  • While width and depth seem to have a significant impact on the price, height doesn’t
  • This model still explains about 72% of the data variability
  • The mean error is now 242.3€
  • It seems that width plays the major role for determining the price of sofas

Using the model for prediction

This time, we will be using a train/set datasets cross-validation method to evaluate the predictive power of our model. We will be using 80% of our dataset as a “train” set and the last 20% as the “test” set.

set.seed(123)
train <- ikea_sofas %>% sample_frac(0.8)
test <- anti_join(ikea_sofas,train) 

model <- lm(euro_price ~ width + height + depth, data = train)

We can now try to predict the price of buffets in the “test” dataset :

prediction <- augment(model, newdata = test) 

ggplot(prediction,aes(x = euro_price, y = .fitted)) +
  geom_point(color = ikea_yellow) +
  geom_abline(intercept = 0, slope = 1) +
  labs(x = "Actual price", y = "Predicted price")

We can now evaluate the accuracy of our model by computing the Root Mean Square Error (RMSE) :

RMSE <- prediction %>% 
  summarise(square_error = (euro_price - .fitted)^2) %>% 
  summarise(mean_square_error = mean(square_error,na.rm=TRUE)) %>% 
  summarise(RMSE = sqrt(mean_square_error))

RMSE
RMSE
239.0272

The Root Mean Square Error (RMSE) of the prediction is of 239.03€.

This prediction could be enhanced by the usage of other, more complex regression methods. But we will stop here for now as it is already quite a long analysis.

LS0tCnRpdGxlOiAiV2VlayA0NSA6IElLRUEgZnVybml0dXJlIgpzdWJ0aXRsZSA6ICJUaGlzIHdlZWsncyB0b3BpYyA6IFNpbXBsZSBhbmQgbXVsdGlwbGUgbGluZWFyIHJlZ3Jlc3Npb24iCmF1dGhvcjogIkRhdmlkIENhcmF5b24iCmRhdGU6ICJMYXN0IHVwZGF0ZSA6YHIgU3lzLkRhdGUoKWAiCm91dHB1dDogCiAgaHRtbF9kb2N1bWVudDoKICAgIGNzczogYW5hbHlzaXNfcmVwb3J0LmNzcwogICAgdGhlbWU6IHVuaXRlZAogICAgZGZfcHJpbnQ6IGthYmxlCiAgICBoaWdobGlnaHQ6IGhhZGRvY2sKICAgIHRvYzogdHJ1ZQogICAgdG9jX2Zsb2F0OgogICAgICBjb2xsYXBzZWQ6IGZhbHNlCiAgICAgIHNtb290aF9zY3JvbGw6IHRydWUKICAgIHRvY19kZXB0aDogMwogICAgbnVtYmVyX3NlY3Rpb25zOiBmYWxzZQogICAgY29kZV9kb3dubG9hZDogdHJ1ZQprbml0OiAoZnVuY3Rpb24oaW5wdXRGaWxlLCBlbmNvZGluZykgewogIHJtYXJrZG93bjo6cmVuZGVyKGlucHV0RmlsZSwgZW5jb2RpbmcgPSBlbmNvZGluZywgb3V0cHV0X2RpciA9ICIuLi9odG1sIikgfSkKLS0tCgpgYGB7ciBzZXR1cCwgaW5jbHVkZT1GQUxTRX0Ka25pdHI6Om9wdHNfY2h1bmskc2V0KGVjaG8gPSBUUlVFLCB3YXJuaW5nID0gRkFMU0UsIG1lc3NhZ2UgPSBGQUxTRSkKYGBgCgojIFBhY2thZ2VzIGFuZCBzZXR1cAoKV2Ugd2lsbCBiZSB1c2luZyB0aGVzZSBwYWNrYWdlcyA6CgpgYGB7ciBsaWJyYXJpZXN9CiMgSW1wb3J0aW5nIGFuZCB3cmFuZ2xpbmcgZGF0YQpsaWJyYXJ5KHJlYWRyKQpsaWJyYXJ5KHRpZHlyKQpsaWJyYXJ5KGRwbHlyKQoKIyBFREEKbGlicmFyeShza2ltcikKCiMgRm9yIHBsb3R0aW5nCmxpYnJhcnkoZ2dwbG90MikKCiMgTW9kZWxsaW5nCmxpYnJhcnkoZ2dmb3J0aWZ5KQpsaWJyYXJ5KHJzdGF0aXgpCmxpYnJhcnkoZ2dwdWJyKQpsaWJyYXJ5KGJyb29tKQpgYGAKCkFuZCB3ZSB3aWxsIGJlIHVzaW5nIG15IHVzdWFsIGN1c3RvbSBnZ3Bsb3QgdGhlbWUgOgoKYGBge3IgdGhlbWUgbG9hZGluZ30Kc291cmNlKCJodHRwczovL3Jhdy5naXRodWJ1c2VyY29udGVudC5jb20vZGF2aWRjYXJheW9uL1RpZHlUdWVzZGF5U3VibWlzc2lvbnMvbWFzdGVyL1IvdGhlbWVzLlIiKQoKIyBBbmQgYWRkaW5nIHNvbWUgSUtFQSBzcGVjaWZpY2l0aWVzCmlrZWFfeWVsbG93IDwtICIjRjdFNzAwIgppa2VhX2JsdWUgPC0gIiMyNzNGQUUiCgp0aGVtZV91cGRhdGUocGFuZWwuYmFja2dyb3VuZCA9IGVsZW1lbnRfcmVjdChmaWxsID0gaWtlYV9ibHVlKSkKCmBgYAoKU28gdGhpcyB3ZWVrJ3MgZGF0YSBpcyBhYm91dCBJS0VBIGZ1cm5pdHVyZS4gRmlyc3QsIHdlIGxvYWQgdGhlIGRhdGEuIEFzIHRoZXJlIGlzIG9ubHkgb25lIGRhdGFmcmFtZSB0aGlzIHdlZWssIHdlJ2xsIHJlYWQgdGhlIGRhdGEgbWFudWFsbHkgOgoKYGBge3IgZGF0YSBsb2FkaW5nfQppa2VhIDwtIHJlYWRyOjpyZWFkX2NzdignaHR0cHM6Ly9yYXcuZ2l0aHVidXNlcmNvbnRlbnQuY29tL3Jmb3JkYXRhc2NpZW5jZS90aWR5dHVlc2RheS9tYXN0ZXIvZGF0YS8yMDIwLzIwMjAtMTEtMDMvaWtlYS5jc3YnKSAlPiUgCiAgbXV0YXRlKGl0ZW1faWQgPSBhcy5jaGFyYWN0ZXIoaXRlbV9pZCkpICMgRG9uJ3Qgd2FudCB0aGUgSUQgdG8gYmUgY29uc2lkZXJlZCBhcyBhIG51bWVyaWMgdmFyaWFibGUuIApgYGAKCiMgRXhwbG9yYXRvcnkgRGF0YSBBbmFseXNpcwoKIyMgRGF0YSBzdHJ1Y3R1cmUKCkxldCdzIGhhdmUgYSBsb29rIGF0IHRoZSBkYXRhIHN0cnVjdHVyZSA6CgpgYGB7ciBnbGltcHNlfQpnbGltcHNlKGlrZWEpCmBgYAoKV2UgY2FuIG5vdyBoYXZlIGEgZGVlcGVyIGxvb2sgaW50byBkYXRhIHN0cnVjdHVyZS4gSSBlc3BlY2lhbGx5IGxpa2UgdGhlIGBza2ltcjo6c2tpbSgpYCBmdW5jdGlvbiBmb3IgdGhpcyA6CgpgYGB7ciBza2ltfQpza2ltKGlrZWEpCmBgYAoKSXQgc2VlbXMgdGhhdCBlYWNoIHBpZWNlIG9mIGZ1cm5pdHVyZSBoYXMgYSBzaW5nbGUgSUQgYW5kIGEgbmFtZS4gVGhlc2UgZnVybml0dXJlIGFyZSBkZXNjcmliZWQgYnkgYSBxdWFsaXRhdGl2ZSB2YXJpYWJsZSwgYGNhdGVnb3J5YCwgYW5kIGEgZmV3IHF1YW50aXRhdGl2ZSB2YXJpYWJsZXMgc3VjaCBhcyBgcHJpY2VgLCBgZGVwdGhgLCBgaGVpZ2h0YCBvciBgd2lkdGhgLiBXZSBhbHJlYWR5IGNhbiBzZWUgc29tZSBgTkFgJ3MgdGhhdCB3aWxsIGhhdmUgdG8gYmUgdGFrZW4gaW50byBhY2NvdW50IGZvciBkYXRhIGFuYWx5c2lzLgoKV2UgY2FuIGFsc28gbm90ZSB0aGF0IHRoZSB0b3RhbCBudW1iZXIgb2Ygcm93cyAoYHIgbnJvdyhpa2VhKWApIGlzIGxhcmdlciB0aGFuIHRoZSBudW1iZXIgb2YgdW5pcXVlIGZ1cm5pdHVyZSBpZCdzICgyOTYyKS4gV2UgY2FuIHRyeSB0byBleHBsb3JlIHdoeSA6CgpgYGB7ciBmaW5kaW5nIGR1cGxpY2F0ZXN9CgojIExldCdzIGZpbmQgdGhlIGR1cGxpY2F0ZXMKZHVwbGljYXRlcyA8LSBpa2VhICU+JSAKICBncm91cF9ieShpdGVtX2lkKSAlPiUgCiAgY291bnQoKSAlPiUgCiAgZmlsdGVyKG4+MSkgJT4lIAogIHB1bGwoaXRlbV9pZCkKCiMgRmlsdGVyIGJ5IGR1cGxpY2F0ZXMKaWtlYSAlPiUgZmlsdGVyKGl0ZW1faWQgJWluJSBkdXBsaWNhdGVzKSAlPiUgCiAgYXJyYW5nZShpdGVtX2lkKSAlPiUgCiAgc2VsZWN0KFgxOnByaWNlKSAlPiUgCiAgaGVhZCgpCmBgYAoKVGhlIG1pc21hdGNoIGJldHdlZW4gaXRlbSBJRCBhbmQgdGhlIG51bWJlciBvZiByb3dzIGlzIGR1ZSB0byBtdWx0aXBsZSBjYXRlZ29yaWVzIGJlaW5nIGFzc2lnbmVkIHRvIHRoZSBzYW1lIGl0ZW0sIGNyZWF0aW5nIG11bHRpcGxlIGxpbmVzIGZvciB0aGUgc2FtZSBvYmplY3QuIFRoaXMgY291bGQgZXZlbnR1YWxseSBiZSBmaXhlZCBieSBhZ2dyZWdhdGluZyB0aGVzZSBsaW5lcyB3aXRoIGEgYHBhc3RlKGNvbGxhcHNlID0gIiwgIilgIGJ1dCB3ZSB3aWxsIGtlZXAgdGhlIG9yaWdpbmFsIHN0cnVjdHVyZSBmb3Igbm93LgoKIyMgU29tZSBjaGFydHMKClNwZWFraW5nIGFib3V0IGNhdGVnb3JpZXMsIGxldCdzIHNlZSB3aGljaCBjYXRlZ29yaWVzIGFyZSB0aGUgbW9zdCBzb2xkIGJ5IElLRUEgOgoKYGBge3IgY291bnRpbmcgY2F0ZWdvcmllcywgZmlnLndpZHRoID0gOX0KCmlrZWEgJT4lIAogIGdyb3VwX2J5KGNhdGVnb3J5KSAlPiUgCiAgY291bnQoKSAlPiUgCgpnZ3Bsb3QoYWVzKHggPSByZW9yZGVyKGNhdGVnb3J5LG4pLCB5ID0gbikpICsKICBnZW9tX3NlZ21lbnQoYWVzKHhlbmQgPSByZW9yZGVyKGNhdGVnb3J5LG4pLCB5ID0gMCwgeWVuZCA9IG4pLCBjb2xvciA9IGlrZWFfeWVsbG93LCBzaXplID0gMikgKwogIGdlb21fcG9pbnQoc2hhcGUgPSAyMSwgc2l6ZSA9IDQsIGZpbGwgPSBpa2VhX3llbGxvdywgY29sb3IgPSAiYmxhY2siKSArCiAgc2NhbGVfeV9jb250aW51b3VzKGJyZWFrcyA9IHNlcSgwLDYwMCw1MCkpICsKICBjb29yZF9mbGlwKCkgKwogIGxhYnMoeSA9ICIjIG9mIGl0ZW1zIiwgeCA9ICJDYXRlZ29yaWVzIikKYGBgCgpUaGVuIHdlIGNhbiBhc2sgb3Vyc2VsdmVzIDogV2hpY2ggY2F0ZWdvcmllcyBhcmUsIG9uIGF2ZXJhZ2UsIHRoZSBtb3N0IGV4cGVuc2l2ZSBvbmVzID8KCmBgYHtyIGJveHBsb3QsIGZpZy53aWR0aCA9IDl9CgojIFdlIGNvbnZlcnQgdGhlIHByaWNlcyBpbiBldXJvcyDigqwgdXNpbmcgYW4gYXBwcm94aW1hdGUgcmF0ZSBvZiAxU0FSIH4gMC4yMkVVUgoKaWtlYSA8LSBpa2VhICU+JSAKICBtdXRhdGUoZXVyb19wcmljZSA9IHByaWNlICogMC4yMikKCgptZWFuX3ByaWNlcyA8LSBpa2VhICU+JQogIGdyb3VwX2J5KGNhdGVnb3J5KSAlPiUgCiAgbXV0YXRlKG1lYW5fcHJpY2UgPSBtZWFuKGV1cm9fcHJpY2UsIG5hLnJtPVRSVUUpLAogICAgICAgICBtZWRpYW5fcHJpY2UgPSBtZWRpYW4oZXVyb19wcmljZSxuYS5ybT1UUlVFKSkgJT4lCiAgdW5ncm91cCgpCgpnZ3Bsb3QobWVhbl9wcmljZXMsYWVzKHggPSByZW9yZGVyKGNhdGVnb3J5LG1lZGlhbl9wcmljZSksIHkgPSBldXJvX3ByaWNlKSkgKwogIHN0YXRfYm94cGxvdChnZW9tID0gImVycm9yYmFyIiwgd2lkdGggPSAwLjMsIGNvbG9yID0gaWtlYV95ZWxsb3cpICsKICBnZW9tX2JveHBsb3QoZmlsbCA9IGlrZWFfeWVsbG93LCBjb2xvciA9ICJibGFjayIsIG91dGxpZXIuY29sb3IgPSBpa2VhX3llbGxvdykgKwogIGNvb3JkX2ZsaXAoKSArCiAgc2NhbGVfeV9jb250aW51b3VzKGxhYmVscyA9IHNjYWxlczo6ZG9sbGFyX2Zvcm1hdChzdWZmaXggPSAi4oKsIiwgcHJlZml4ID0gIiIpKSArCiAgbGFicyh4ID0gIkNhdGVnb3JpZXMiLCB5ID0gIlByaWNlIGluIGV1cm9zIOKCrCIpCgpgYGAKClNvLCB3YXJkcm9iZXMsIHNvZmFzIGFuZCBiZWRzIGFyZSAqdXN1YWxseSogdGhlIG1vc3QgZXhwZW5zaXZlIHBpZWNlcyBvZiBmdXJuaXR1cmUgc29sZCBieSBJS0VBLiBCdXQgd2UgY2FuIGNsZWFybHkgc2VlIGhlcmUgdGhhdCB0aGVyZSBhcmUgYSBsb3Qgb2Ygb3V0bGllcnMgIQoKIyBMaW5lYXIgbW9kZWwKClRvIHByYWN0aWNlIG91ciBsaW5lYXIgbW9kZWwgc2tpbGxzLCB3ZSdsbCBhc2sgb3Vyc2VsdmVzIGEgdmVyeSBzaW1wbGUgcXVlc3Rpb24gOiBEbyBTb2ZhcyBwcmljZSBkZXBlbmRzIG9uIHRoZWlyIHdpZHRoID8KCkxldCdzIGhhdmUgYSBmaXJzdCBsb29rIHRvIHRoaXMgcmVsYXRpb24gd2l0aCBhIHNpbXBsZSBzY2F0dGVycGxvdCA6CgpgYGB7ciBzY2F0dGVycGxvdH0KaWtlYV9zb2ZhcyA8LSBpa2VhICU+JSBmaWx0ZXIoY2F0ZWdvcnkgPT0gIlNvZmFzICYgYXJtY2hhaXJzIikgCgpnZ3Bsb3QoaWtlYV9zb2ZhcyxhZXMoeCA9IHdpZHRoLCB5ID0gZXVyb19wcmljZSkpICsKICBnZW9tX3BvaW50KGNvbG9yID0gaWtlYV95ZWxsb3cpICsKICBzY2FsZV95X2NvbnRpbnVvdXMobGFiZWxzID0gc2NhbGVzOjpkb2xsYXJfZm9ybWF0KHN1ZmZpeCA9ICLigqwiLCBwcmVmaXggPSAiIikpICsKICBsYWJzKHggPSAiU29mYXMgd2lkdGggKGNtKSIsIHkgPSJQcmljZSAo4oKsKSIsIHRpdGxlID0gcGFzdGUwKCJJS0VBIFNvZmFzIGFuZCBhcm1jaGFpcnMsIE4gPSAiLG5yb3coaWtlYV9zb2ZhcyksIiBpdGVtcyIpKQpgYGAKCiMjIENoZWNraW5nIGZvciB0aGUgTElORSBjb25kaXRpb25zCgpPdXIgbGluZWFyIG1vZGVsIHByb3Bvc2l0aW9uIHNlZW1zIGFwcHJvcHJpYXRlLiBCdXQgZmlyc3QsIHdlIG5lZWQgdG8gY2hlY2sgdGhlIExJTkUgY29uZGl0aW9ucyA6CgotICAgTCA6IExpbmVhcgotICAgSSA6IEluZGVwZW5kZW50Ci0gICBOIDogTm9ybWFsaXR5Ci0gICBFIDogRXF1YWxpdHkgb2YgdmFyaWFuY2VzCgpUaGUgbGluZWFyIG1vZGVsIHdlIGFyZSB1c2luZyBpcyBhY3R1YWxseSBhbiBPcmRpbmFyeSBMZWFzdCBTcXVhcmVzIChPTFMpIHJlZ3Jlc3Npb24sIHdoaWNoIG1lYW5zIHRoYXQgdGhlIGFsZ29yaXRobSB1c2VkIHRvIGZpbmQgdGhlICJiZXN0IGZpdHRpbmcgbGluZSIgaXMgdGhlIG9uZSB0aGF0IG1pbmltaXplcyB0aGUgc3F1YXJlIG9mIHRoZSBkaXN0YW5jZSBiZXR3ZWVuIGVhY2ggcG9pbnQgYW5kIHRoZSBsaW5lIChpLmUuIHRoZSByZXNpZHVhbCB2YWx1ZXMpIDoKCmBgYHtyIGxlYXN0IHNxdWFyZXN9CiMgRmlyc3QgcnVuIHRoZSBtb2RlbAptb2RlbCA8LSBsbShldXJvX3ByaWNlIH4gd2lkdGgsIGRhdGEgPSBpa2VhX3NvZmFzKQoKYXVnbWVudChtb2RlbCkgJT4lIAogIGdncGxvdChhZXMoeCA9IHdpZHRoLCB5ID0gZXVyb19wcmljZSkpICsKICBnZW9tX3NlZ21lbnQoYWVzKHggPSB3aWR0aCwgeSA9IGV1cm9fcHJpY2UsIHllbmQgPSAuZml0dGVkLCB4ZW5kID0gd2lkdGgpLCBjb2xvciA9ICJyZWQiKSArCiAgZ2VvbV9wb2ludChmaWxsID0gaWtlYV95ZWxsb3csIGNvbG9yID0gImJsYWNrIixzaGFwZSA9IDIxKSArCiAgZ2VvbV9saW5lKGFlcyh5ID0gLmZpdHRlZCksIGNvbG9yID0gaWtlYV95ZWxsb3cpICsKICBzY2FsZV95X2NvbnRpbnVvdXMobGFiZWxzID0gc2NhbGVzOjpkb2xsYXJfZm9ybWF0KHN1ZmZpeCA9ICLigqwiLCBwcmVmaXggPSAiIikpICsKICBsYWJzKHggPSAiU29mYXMgd2lkdGggKGNtKSIsIHkgPSAiUHJpY2UgKOKCrCkiKQoKYGBgCgpXZSBjYW4gbm93IGNoZWNrIG91ciBjb25kaXRpb25zIHVzaW5nIHRoZSBgYXV0b3Bsb3QubG1gIG1ldGhvZCBmcm9tIHRoZSBge2dnZm9ydGlmeX1gIHBhY2thZ2UgOgoKYGBge3IgbGluZSAxLCBmaWcud2lkdGggPSAxMn0KYXV0b3Bsb3QobW9kZWwpICsgdGhlbWVfYncoKQpgYGAKCkhhdmluZyBhIGxvb2sgYXQgdGhlc2UgZ3JhcGhzLCB3ZSBjYW4gc2VlIHRoYXQgdGhlIG5vcm1hbGl0eSBjb25kaXRpb24gc2VlbXMgdG8gYmUgcmVzcGVjdGVkICh0aGUgcmVzaWR1YWxzIGFyZSBnbG9iYWxseSBmaXR0aW5nIHRoZSBRUS1saW5lKS4gVGhlIGVxdWFsaXR5IG9mIHZhcmlhbmNlcyBhbHNvIHNlZW1zIHJlc3BlY3RlZCB3aGVuIGNvbnNpZGVyaW5nIHRoZSByZXNpZHVhbHMgKnZzKiBmaXR0ZWQgZ3JhcGggKGV4Y2VwdCBhIG1pbm9yICJiYW5hbmEiIGZvcm0pLiBXZSBjYW4gYWxzbyBzZWUgdGhhdCB0aGUgMzY1dGggZGF0YSBwb2ludCBzZWVtcyB0byBiZSBhbiBleHRyZW1lL291dGxpZXIgdmFsdWUuIEZvciBmdXR1cmUgYW5hbHlzaXMsIHdlIGRlY2lkZSB0byByZW1vdmUgdGhpcyBwYXJ0aWN1bGFyIHBvaW50IDoKCmBgYHtyIHJlbW92aW5nIG91dGxpZXJ9CmlrZWFfc29mYXMgPC0gaWtlYV9zb2Zhc1stMzY1LF0KYGBgCgpXZSBjYW4gY29uc2lkZXIgdGhhdCBvdXIgbW9kZWwgZG9lc24ndCB2aW9sYXRlIHRoZSBMSU5FIGNvbmRpdGlvbnMuCgojIyBCYXNpYyBtb2RlbAoKV2UgY2FuIGZpdCB0aGUgbW9kZWwgdXNpbmcgdGhlIHVzdWFsIGBsbWAgZnVuY3Rpb24gOgoKYGBge3IgbW9kZWx9Cm1vZGVsIDwtIGxtKGV1cm9fcHJpY2UgfiB3aWR0aCwgZGF0YSA9IGlrZWFfc29mYXMpCgpzdW1tYXJ5KG1vZGVsKQoKYGBgCgpXaGF0IHdlIGNhbiBzYXkgYWJvdXQgdGhlIG1vZGVsIG91dHB1dCA6CgotICAgVGhlIGludGVyY2VwdCB2YWx1ZXMgc2F5cyB0aGF0IGEgc29mYSBvZiAwIGNtIHdvdWxkIGNvc3QgLTIwOC4yOeKCrCwgd2hpY2ggaGFzIG5vIHJlYWwgbWVhbmluZyBoZXJlLgotICAgRm9yIGVhY2ggYWRkaXRpb25hbCBjbSBvZiBgd2lkdGhgICwgdGhlIHByaWNlIGluY3JlYXNlcyBvbiBhdmVyYWdlIGJ5IDQuNDTigqwKLSAgIFRoZSByZWxhdGlvbiBiZXR3ZWVuIHRoZXNlIDIgdmFyaWFibGVzIGlzIHN0YXRpc3RpY2FsbHkgc2lnbmlmaWNhbnQgKFAtdmFsdWUgXDwgMC4wNSkKLSAgIFRoZSBtb2RlbCBleHBsYWlucyBhYm91dCA3MyUgb2YgdGhlIHZhcmlhYmlsaXR5Ci0gICBUaGUgbWVhbiBlcnJvciBpcyBob3dldmVyIHF1aXRlIGhpZ2gsIGJlaW5nIG9mIDI0Ny424oKsLgoKV2l0aCB0aGlzIG1vZGVsLCB3ZSBjb3VsZCB0cnkgdG8gcHJlZGljdCB0aGUgcHJpY2Ugb2YgYSBuZXcgSUtFQSBzb2ZhLCBiYXNlZCBvbiBpdHMgd2lkdGguCgpGb3IgZXhhbXBsZSwgd2UgY2FuIHRyeSB0aGlzIHdpdGggYSBuZXcgc29mYSB3aXRoIGEgd2lkdGggb2YgMTIzIGNtIDoKCmBgYHtyIHByZWRpY3Rpb259CiMgV2UgZXh0cmFjdCBhIHNpbmdsZSBsaW5lIGZyb20gdGhlIGRhdGFzZXQKbmV3X2RhdGEgPC0gZGF0YS5mcmFtZSh3aWR0aCA9IDEyMykKCm5ld19kYXRhJHByZWRpY3RlZF9wcmljZSA8LSBwcmVkaWN0KG1vZGVsLG5ld2RhdGEgPSBuZXdfZGF0YSkKCm5ld19kYXRhCgpgYGAKCmBgYHtyfQpnZ3Bsb3QoaWtlYV9zb2ZhcyxhZXMoeCA9IHdpZHRoLCB5ID0gZXVyb19wcmljZSkpICsKICBnZW9tX3BvaW50KGNvbG9yID0gaWtlYV95ZWxsb3cpICsKICBnZW9tX3BvaW50KGRhdGEgPSBuZXdfZGF0YSwgYWVzKHggPSB3aWR0aCwgeSA9IHByZWRpY3RlZF9wcmljZSksIHNpemUgPSA0LCBjb2xvciA9ICJyZWQiKSArCiAgc2NhbGVfeV9jb250aW51b3VzKGxhYmVscyA9IHNjYWxlczo6ZG9sbGFyX2Zvcm1hdChzdWZmaXggPSAi4oKsIiwgcHJlZml4ID0gIiIpKSArCiAgbGFicyh4ID0gIlNvZmFzIHdpZHRoIChjbSkiLCB5ID0iUHJpY2UgKOKCrCkiKSAgKwogIHRoZW1lKHBhbmVsLmJhY2tncm91bmQgPSBlbGVtZW50X3JlY3QoZmlsbCA9IGlrZWFfYmx1ZSkpCmBgYAoKQnV0IG1heWJlIHdlIGNvdWxkIGJ1aWxkIGEgbW9yZSBjb21wbGV4IG1vZGVsLCBhbHNvIGluY2x1ZGluZyBoZWlnaHQgYW5kIGRlcHRoLgoKIyBNdWx0aXBsZSByZWdyZXNzaW9uCgpgYGB7ciBsaW5lMiwgZmlnLndpZHRoID0gMTJ9Cm1vZGVsIDwtIGxtKGV1cm9fcHJpY2UgfiB3aWR0aCArIGhlaWdodCArIGRlcHRoLCBkYXRhID0gaWtlYV9zb2ZhcykKYXV0b3Bsb3QobW9kZWwpICsgdGhlbWVfYncoKQpgYGAKClRoZSBMSU5FIGNvbmRpdGlvbnMgYXJlIGZ1bGZpbGxlZC4gV2UgY2FuIG5vdyBleHBsb3JlIHRoZSBtb2RlbCBzdW1tYXJ5LgoKYGBge3IgbW9kZWwgMn0Kc3VtbWFyeShtb2RlbCkKYGBgCgpXaGF0IHdlIGNhbiBzYXkgOgoKLSAgIEVhY2ggc2xvcGUgY29lZmZpY2llbnQgaXMgYSBwYXJ0aWFsIHJlZ3Jlc3Npb24gY29lZmZpY2llbnQuIEZvciBleGFtcGxlLCB3ZSBjYW4gc2F5IHRoYXQsIG9uIGF2ZXJhZ2UsIGVhY2ggY20gb2YgYGRlcHRoYCB3aWxsIHJhaXNlIHRoZSBwcmljZSBieSAyLjXigqwgKmFmdGVyIGNvbnRyb2xsaW5nIGZvciB3aWR0aCBhbmQgaGVpZ2h0Ki4KLSAgIFdoaWxlIHdpZHRoIGFuZCBkZXB0aCBzZWVtIHRvIGhhdmUgYSBzaWduaWZpY2FudCBpbXBhY3Qgb24gdGhlIHByaWNlLCBoZWlnaHQgZG9lc24ndAotICAgVGhpcyBtb2RlbCBzdGlsbCBleHBsYWlucyBhYm91dCA3MiUgb2YgdGhlIGRhdGEgdmFyaWFiaWxpdHkKLSAgIFRoZSBtZWFuIGVycm9yIGlzIG5vdyAyNDIuM+KCrAotICAgSXQgc2VlbXMgdGhhdCB3aWR0aCBwbGF5cyB0aGUgbWFqb3Igcm9sZSBmb3IgZGV0ZXJtaW5pbmcgdGhlIHByaWNlIG9mIHNvZmFzCgojIyBVc2luZyB0aGUgbW9kZWwgZm9yIHByZWRpY3Rpb24KClRoaXMgdGltZSwgd2Ugd2lsbCBiZSB1c2luZyBhIHRyYWluL3NldCBkYXRhc2V0cyBjcm9zcy12YWxpZGF0aW9uIG1ldGhvZCB0byBldmFsdWF0ZSB0aGUgcHJlZGljdGl2ZSBwb3dlciBvZiBvdXIgbW9kZWwuIFdlIHdpbGwgYmUgdXNpbmcgODAlIG9mIG91ciBkYXRhc2V0IGFzIGEgInRyYWluIiBzZXQgYW5kIHRoZSBsYXN0IDIwJSBhcyB0aGUgInRlc3QiIHNldC4KCmBgYHtyIG1vZGVsIHRyYWlufQpzZXQuc2VlZCgxMjMpCnRyYWluIDwtIGlrZWFfc29mYXMgJT4lIHNhbXBsZV9mcmFjKDAuOCkKdGVzdCA8LSBhbnRpX2pvaW4oaWtlYV9zb2Zhcyx0cmFpbikgCgptb2RlbCA8LSBsbShldXJvX3ByaWNlIH4gd2lkdGggKyBoZWlnaHQgKyBkZXB0aCwgZGF0YSA9IHRyYWluKQpgYGAKCldlIGNhbiBub3cgdHJ5IHRvIHByZWRpY3QgdGhlIHByaWNlIG9mIGJ1ZmZldHMgaW4gdGhlICJ0ZXN0IiBkYXRhc2V0IDoKCmBgYHtyIGZpbmFsIHByZWRpY3Rpb259CnByZWRpY3Rpb24gPC0gYXVnbWVudChtb2RlbCwgbmV3ZGF0YSA9IHRlc3QpIAoKZ2dwbG90KHByZWRpY3Rpb24sYWVzKHggPSBldXJvX3ByaWNlLCB5ID0gLmZpdHRlZCkpICsKICBnZW9tX3BvaW50KGNvbG9yID0gaWtlYV95ZWxsb3cpICsKICBnZW9tX2FibGluZShpbnRlcmNlcHQgPSAwLCBzbG9wZSA9IDEpICsKICBsYWJzKHggPSAiQWN0dWFsIHByaWNlIiwgeSA9ICJQcmVkaWN0ZWQgcHJpY2UiKQoKYGBgCgpXZSBjYW4gbm93IGV2YWx1YXRlIHRoZSBhY2N1cmFjeSBvZiBvdXIgbW9kZWwgYnkgY29tcHV0aW5nIHRoZSBSb290IE1lYW4gU3F1YXJlIEVycm9yIChSTVNFKSA6CgpgYGB7ciBtZXRyaWNzIGNhbGN1bGF0aW9ufQpSTVNFIDwtIHByZWRpY3Rpb24gJT4lIAogIHN1bW1hcmlzZShzcXVhcmVfZXJyb3IgPSAoZXVyb19wcmljZSAtIC5maXR0ZWQpXjIpICU+JSAKICBzdW1tYXJpc2UobWVhbl9zcXVhcmVfZXJyb3IgPSBtZWFuKHNxdWFyZV9lcnJvcixuYS5ybT1UUlVFKSkgJT4lIAogIHN1bW1hcmlzZShSTVNFID0gc3FydChtZWFuX3NxdWFyZV9lcnJvcikpCgpSTVNFCmBgYAoKVGhlIFJvb3QgTWVhbiBTcXVhcmUgRXJyb3IgKFJNU0UpIG9mIHRoZSBwcmVkaWN0aW9uIGlzIG9mIGByIHJvdW5kKFJNU0UkUk1TRSwyKWDigqwuCgpUaGlzIHByZWRpY3Rpb24gY291bGQgYmUgZW5oYW5jZWQgYnkgdGhlIHVzYWdlIG9mIG90aGVyLCBtb3JlIGNvbXBsZXggcmVncmVzc2lvbiBtZXRob2RzLiBCdXQgd2Ugd2lsbCBzdG9wIGhlcmUgZm9yIG5vdyBhcyBpdCBpcyBhbHJlYWR5IHF1aXRlIGEgbG9uZyBhbmFseXNpcy4K