Importance of Project

Our project goal is to predict the number of installs of an app by looking at app information and its reviews. We hope that this project will be helpful to app developers who need to estimate their number of installs, or to investors who want to pick out the next big app. Companies may run beta focus groups, or app developers may receive feedback from testers and gather a certain number of reviews. We use this, along with some knowledge about the app, to predict its success. Knowing the number of installs can be very helpful to developers and business managers because it lets them estimate profit. The results of this project may also show how important reviews are in the app market, since they could be one of the determining factors for the number of installs.

What will be done? Data science can be summarized into five steps: capture, maintain, process, analyze and communicate. We gather data with meaningful variables that lead to appropriate classes, then clean the data so that it is easy for a computer to read and to use for modeling. We apply algorithms to train a model, test it on the data acquired above, and analyze the model's performance. Finally, we review the results and attempt to extract any relevant insights.

This document consists of the following parts, in the context of data mining:

  1. data collection and cleaning
  2. visualization
  3. missing value imputation
  4. feature engineering
  5. classification (SVM, decision tree, random forest) and model comparison
  6. conclusion
  7. limitations

We will select the model that most accurately predicts the number of installs of apps and identify which characteristics influence the installs of a given app.


# 1. Loading Data

gg = read.csv("googleplaystore.csv")
review = read.csv("googleplaystore_user_reviews.csv")
library(e1071)
library(tidyverse)
review1 = review %>% select(App, Translated_Review)
head(review1)
knitr::kable(head(review1))

| App | Translated_Review |
|---|---|
| 10 Best Foods for You | I like eat delicious food. That’s I’m cooking food myself, case “10 Best Foods” helps lot, also “Best Before (Shelf Life)” |
| 10 Best Foods for You | This help eating healthy exercise regular basis |
| 10 Best Foods for You | nan |
| 10 Best Foods for You | Works great especially going grocery store |
| 10 Best Foods for You | Best idea us |
| 10 Best Foods for You | Best way |

head(review)
head(gg)

# 2. Data Preprocessing

There are two datasets from Kaggle (https://www.kaggle.com/lava18/google-play-store-apps). One is the list of apps with information such as app name, category, rating and more. The other is a list of reviews for each app, with a sentiment label indicating whether each review was positive, neutral or negative. We could not use these two files directly because they are not joined.

First, we analysed which columns are irrelevant to the number of installs of an app; this was done by common sense. We removed size, last updated date, current version and Android version because they are not factors that would affect the number of installs before publishing. We also removed rating and number of reviews because they are obviously associated with installs and would not be known before publishing. Then we trimmed the data of any out-of-place characters.

We combined Category and Genre by grouping common keywords and added the key categories as columns, so that each app's category/genre attributes can be expressed as true or false. Moreover, we gathered the review sentiment for each app and counted the number of positive, neutral and negative reviews to compute their percentages. Finally, we joined those three columns to the existing app list.

This includes removing bad values, splitting binary values, cleaning text values, splitting categorical values, cleaning numerical values, merging rows and dropping columns.
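
As a rough illustration of the sentiment aggregation described above, here is a minimal sketch under our own naming (not the exact code used later in this notebook); it assumes `gg` and `review` were read in the loading step.

```r
library(tidyverse)

# per-app shares of positive, neutral and negative reviews
sent_pct <- review %>%
  filter(Sentiment %in% c("Positive", "Neutral", "Negative")) %>%
  count(App, Sentiment) %>%                  # reviews per app and sentiment label
  group_by(App) %>%
  mutate(share = n / sum(n)) %>%             # counts -> within-app shares
  ungroup() %>%
  select(-n) %>%
  pivot_wider(names_from = Sentiment, values_from = share, values_fill = 0)

# attach the three share columns to the app list
gg_with_sent <- left_join(gg, sent_pct, by = "App")
```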

str(gg)
'data.frame':   10841 obs. of  13 variables:
 $ App           : Factor w/ 9660 levels "- Free Comics - Comic Apps",..: 7206 2551 8970 8089 7272 7103 8149 5568 4926 5806 ...
 $ Category      : Factor w/ 34 levels "1.9","ART_AND_DESIGN",..: 2 2 2 2 2 2 2 2 2 2 ...
 $ Rating        : num  4.1 3.9 4.7 4.5 4.3 4.4 3.8 4.1 4.4 4.7 ...
 $ Reviews       : Factor w/ 6002 levels "0","1","10","100",..: 1183 5924 5681 1947 5924 1310 1464 3385 816 485 ...
 $ Size          : Factor w/ 462 levels "1,000+","1.0M",..: 55 30 368 102 64 222 55 118 146 120 ...
 $ Installs      : Factor w/ 22 levels "0","0+","1,000,000,000+",..: 8 20 13 16 11 17 17 4 4 8 ...
 $ Type          : Factor w/ 4 levels "0","Free","NaN",..: 2 2 2 2 2 2 2 2 2 2 ...
 $ Price         : Factor w/ 93 levels "$0.99","$1.00",..: 92 92 92 92 92 92 92 92 92 92 ...
 $ Content.Rating: Factor w/ 7 levels "","Adults only 18+",..: 3 3 3 6 3 3 3 3 3 3 ...
 $ Genres        : Factor w/ 120 levels "Action","Action;Action & Adventure",..: 10 13 10 10 12 10 10 10 10 12 ...
 $ Last.Updated  : Factor w/ 1378 levels "1.0.19","April 1, 2016",..: 562 482 117 825 757 901 76 726 1317 670 ...
 $ Current.Ver   : Factor w/ 2834 levels "","0.0.0.2","0.0.1",..: 122 1020 468 2827 280 116 280 2393 1457 1431 ...
 $ Android.Ver   : Factor w/ 35 levels "","1.0 and up",..: 17 17 17 20 22 10 17 20 12 17 ...

There are a lot of factor variables which should actually be converted to numeric variables.

## 2.1 Converting variable types (imputation)

library(lubridate)

library(tidyverse)
library(dplyr)
gg.new <- gg %>%
  mutate(
    # Eliminate "+" to transform Installs to numeric variable
   # Installs = gsub("\\+", "", as.character(Installs)),
   # Installs = as.numeric(gsub(",", "", Installs)),
    # Eliminate "M" to transform Size to numeric variable
    Size = gsub("M", "", Size),
    # For cells with k, divide it by 1024, since 1024kB = 1MB, the unit for size is MB
    Size = ifelse(grepl("k", Size),as.numeric(gsub("k", "", Size))/1024, as.numeric(Size)),
    # Transform reviews to numeric
    Reviews = as.numeric(Reviews),
    # Remove "$" from Price to transform it to numeric
    Price = as.numeric(gsub("\\$", "", as.character(Price))),
    # Convert Last Updated to date format
    Last.Updated = mdy(Last.Updated),
    # Replace "Varies with device" to NA since it is unknown
    Min.Android.Ver = gsub("Varies with device", NA, Android.Ver),
    # Keep only version number to 1 decimal as it's most representative
    Min.Android.Ver = as.numeric(substr(Min.Android.Ver, start = 1, stop = 3)),
    # Drop old Android version column
    Android.Ver = NULL
  ) %>% 
  filter(
    # Two apps had type as 0 or NA, they will be removed 
    Type %in% c("Free", "Paid")
 )
Warning: NAs introduced by coercion for `Size` and `Price`; 1 value of `Last.Updated` failed to parse.
str(gg.new)
'data.frame':   10839 obs. of  13 variables:
 $ App            : Factor w/ 9660 levels "- Free Comics - Comic Apps",..: 7206 2551 8970 8089 7272 7103 8149 5568 4926 5806 ...
 $ Category       : Factor w/ 34 levels "1.9","ART_AND_DESIGN",..: 2 2 2 2 2 2 2 2 2 2 ...
 $ Rating         : num  4.1 3.9 4.7 4.5 4.3 4.4 3.8 4.1 4.4 4.7 ...
 $ Reviews        : num  1183 5924 5681 1947 5924 ...
 $ Size           : num  19 14 8.7 25 2.8 5.6 19 29 33 3.1 ...
 $ Installs       : Factor w/ 22 levels "0","0+","1,000,000,000+",..: 8 20 13 16 11 17 17 4 4 8 ...
 $ Type           : Factor w/ 4 levels "0","Free","NaN",..: 2 2 2 2 2 2 2 2 2 2 ...
 $ Price          : num  0 0 0 0 0 0 0 0 0 0 ...
 $ Content.Rating : Factor w/ 7 levels "","Adults only 18+",..: 3 3 3 6 3 3 3 3 3 3 ...
 $ Genres         : Factor w/ 120 levels "Action","Action;Action & Adventure",..: 10 13 10 10 12 10 10 10 10 12 ...
 $ Last.Updated   : Date, format: "2018-01-07" "2018-01-15" ...
 $ Current.Ver    : Factor w/ 2834 levels "","0.0.0.2","0.0.1",..: 122 1020 468 2827 280 116 280 2393 1457 1431 ...
 $ Min.Android.Ver: num  4 4 4 4.2 4.4 2.3 4 4.2 3 4 ...
options(scipen=999)
table(gg.new$Installs)

             0             0+ 1,000,000,000+     1,000,000+         1,000+ 
             0             14             58           1579            907 
            1+    10,000,000+        10,000+            10+   100,000,000+ 
            67           1252           1054            386            409 
      100,000+           100+     5,000,000+         5,000+             5+ 
          1169            719            752            477             82 
   50,000,000+        50,000+            50+   500,000,000+       500,000+ 
           289            479            205             72            539 
          500+           Free 
           330              0 
gg.new$Installs%>%str()%>% print
 Factor w/ 22 levels "0","0+","1,000,000,000+",..: 8 20 13 16 11 17 17 4 4 8 ...
NULL
gg.new %>% filter(Installs == "500,000") %>% print
library(highcharter)
gg.new %>% select(-Min.Android.Ver) %>% 
    summarise_all(
        ~ sum(is.na(.))
    ) %>%
  gather() %>%
  # Only show columns with NA
  filter(value> 1) %>%
  arrange(-value) %>%
    hchart('column', hcaes(x = 'key', y = 'value', color = 'key')) %>%
  hc_add_theme(hc_theme_elementary()) %>%
  hc_title(text = "Columns with Missing Value")

Boxplot of Rating across the install categories

ggplot(data = gg.new) +
  geom_boxplot(aes(x = reorder(Installs.cat, -Rating), y = Rating)) + 
  labs(x = "Installment Categories",y = "Rating")

## 2.3 Delete duplicated rows

# number of observations before deleting duplicated rows
(original_num_rows = nrow(gg.new))
[1] 10839
gg.new.uniq = gg.new %>% distinct
# number of rows after delete duplicated rows
(uniq_num_rows = nrow(gg.new.uniq))
[1] 10356
# number of duplicated rows
(dup_rows = original_num_rows - uniq_num_rows)
[1] 483

## 2.4 Merge Category into 6 groups

# gg.new.uniq %>% filter (!is.na(Category)) %>% print
levels(gg.new.uniq$Category)
 [1] "1.9"                 "ART_AND_DESIGN"      "AUTO_AND_VEHICLES"  
 [4] "BEAUTY"              "BOOKS_AND_REFERENCE" "BUSINESS"           
 [7] "COMICS"              "COMMUNICATION"       "DATING"             
[10] "EDUCATION"           "ENTERTAINMENT"       "EVENTS"             
[13] "FAMILY"              "FINANCE"             "FOOD_AND_DRINK"     
[16] "GAME"                "HEALTH_AND_FITNESS"  "HOUSE_AND_HOME"     
[19] "LIBRARIES_AND_DEMO"  "LIFESTYLE"           "MAPS_AND_NAVIGATION"
[22] "MEDICAL"             "NEWS_AND_MAGAZINES"  "PARENTING"          
[25] "PERSONALIZATION"     "PHOTOGRAPHY"         "PRODUCTIVITY"       
[28] "SHOPPING"            "SOCIAL"              "SPORTS"             
[31] "TOOLS"               "TRAVEL_AND_LOCAL"    "VIDEO_PLAYERS"      
[34] "WEATHER"            
mydata1 = gg.new.uniq %>% filter(Category != 1.9) %>% mutate(Cat.cat = fct_collapse(Category,
                                                        Education = c("EDUCATION", "BOOKS_AND_REFERENCE", "LIBRARIES_AND_DEMO", "ART_AND_DESIGN"),
                                                        Personalization = c("PERSONALIZATION", "BEAUTY", "SHOPPING", "DATING", "PHOTOGRAPHY"),
                                                        Lifestyle = c("HEALTH_AND_FITNESS", "MEDICAL", "LIFESTYLE", "SPORTS", "FOOD_AND_DRINK"),
                                                        Family = c("FAMILY", "PARENTING", "HOUSE_AND_HOME", "1.9"),
                                                        Entertainment = c("ENTERTAINMENT", "GAME", "COMICS", "VIDEO_PLAYERS"), 
                                                        Business = c("BUSINESS", "FINANCE", "PRODUCTIVITY", "TOOLS", "NEWS_AND_MAGAZINES", "EVENTS", "SOCIAL", "COMMUNICATION"),
                                                        Travel = c("MAPS_AND_NAVIGATION", "AUTO_AND_VEHICLES", "TRAVEL_AND_LOCAL", "WEATHER")))
mydata2 = mydata1 %>% mutate(Interval = difftime(time1 = today(), time2 = Last.Updated))
str(mydata2)
'data.frame':   10356 obs. of  16 variables:
 $ App            : Factor w/ 9660 levels "- Free Comics - Comic Apps",..: 7206 2551 8970 8089 7272 7103 8149 5568 4926 5806 ...
 $ Category       : Factor w/ 34 levels "1.9","ART_AND_DESIGN",..: 2 2 2 2 2 2 2 2 2 2 ...
 $ Rating         : num  4.1 3.9 4.7 4.5 4.3 4.4 3.8 4.1 4.4 4.7 ...
 $ Reviews        : num  1183 5924 5681 1947 5924 ...
 $ Size           : num  19 14 8.7 25 2.8 5.6 19 29 33 3.1 ...
 $ Installs       : Factor w/ 22 levels "0","0+","1,000,000,000+",..: 8 20 13 16 11 17 17 4 4 8 ...
 $ Type           : Factor w/ 4 levels "0","Free","NaN",..: 2 2 2 2 2 2 2 2 2 2 ...
 $ Price          : num  0 0 0 0 0 0 0 0 0 0 ...
 $ Content.Rating : Factor w/ 7 levels "","Adults only 18+",..: 3 3 3 6 3 3 3 3 3 3 ...
 $ Genres         : Factor w/ 120 levels "Action","Action;Action & Adventure",..: 10 13 10 10 12 10 10 10 10 12 ...
 $ Last.Updated   : Date, format: "2018-01-07" "2018-01-15" ...
 $ Current.Ver    : Factor w/ 2834 levels "","0.0.0.2","0.0.1",..: 122 1020 468 2827 280 116 280 2393 1457 1431 ...
 $ Min.Android.Ver: num  4 4 4 4.2 4.4 2.3 4 4.2 3 4 ...
 $ Installs.cat   : Factor w/ 3 levels "low","high","medium": 3 3 2 2 3 3 3 2 2 3 ...
 $ Cat.cat        : Factor w/ 7 levels "Family","Education",..: 2 2 2 2 2 2 2 2 2 2 ...
 $ Interval       : 'difftime' num  1095 1087 889 943 ...
  ..- attr(*, "units")= chr "days"
mydata2 %>% filter(Installs.cat == "low") %>% print

Impute missing values

#missForest
library(missForest)
#impute missing values, using all parameters as default values
gg.new.imp <- missForest(data.matrix(mydata2), maxiter = 5, ntree = 10)
  missForest iteration 1 in progress...done!
  missForest iteration 2 in progress...done!
  missForest iteration 3 in progress...done!
#check imputed values
# gg.new.imp$ximp
#check imputation error
gg.new.imp$OOBerror
      NRMSE 
0.001106975 

Removing outliers. For the sentiment scores, some data points are outliers. We remove them by cutting the head of the positive sentiment score distribution at 0.2; similarly, we prune the neutral and negative sentiment score distributions by cutting their tails at 0.4 and 0.5, respectively.
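
One possible reading of those cut-offs, as a hedged sketch that reuses the hypothetical `sent_pct` table of per-app sentiment shares from the sketch in the preprocessing section (columns `Positive`, `Neutral`, `Negative` holding each app's share of reviews):

```r
sent_trimmed <- sent_pct %>%
  filter(
    Positive >= 0.2,   # cut the head of the positive-share distribution at 0.2
    Neutral  <= 0.4,   # cut the tail of the neutral share at 0.4
    Negative <= 0.5    # cut the tail of the negative share at 0.5
  )
```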

Get the sentiment score

# install.packages("stringr")
# install.packages("tidytext")
library(stringr)
library(tidytext)
# read in user reviews
user_review = read.csv("googleplaystore_user_reviews.csv")
str(user_review)
'data.frame':   64295 obs. of  5 variables:
 $ App                   : Factor w/ 1074 levels "10 Best Foods for You",..: 1 1 1 1 1 1 1 1 1 1 ...
 $ Translated_Review     : Factor w/ 27996 levels "","___ ___ ___ ___ ___ 0",..: 9279 23853 17229 27355 2076 2168 1032 17229 15968 13280 ...
 $ Sentiment             : Factor w/ 4 levels "nan","Negative",..: 4 4 1 4 4 4 4 1 3 3 ...
 $ Sentiment_Polarity    : num  1 0.25 NaN 0.4 1 1 0.6 NaN 0 0 ...
 $ Sentiment_Subjectivity: num  0.533 0.288 NaN 0.875 0.3 ...
user_review %>% print
head(user_review)
# get sentiment data frame
# the AFINN lexicon in current tidytext has columns `word` and `value` (not `score`)
sents = get_sentiments("afinn") %>% print
range(sents$value)
# left join the sentiment chart and the user reviews to get score
t1 = user_review %>% mutate(review = as.character(Translated_Review)) %>% unnest_tokens(word, review)
# t2 = user_review[1:500, ]
# sum the per-word AFINN `value` within each app; the original `sum(t1$score, ...)`
# referenced a column that does not exist and silently returned 0 for every app
user_score = left_join(t1, sents) %>% group_by(App) %>% summarise(n = n(), score = sum(value, na.rm = TRUE)) %>% mutate(avg.score = score / n) %>% print
Joining, by = "word"
`summarise()` ungrouping output (override with `.groups` argument)
# range(user_score $ avg.score)
user_review %>% group_by(App) %>% count
t11 = user_score %>% inner_join(gg.new) %>% filter(Installs != 5000) %>% filter(Installs != 1000000000)
Joining, by = "App"
ggplot(t11) + geom_line(aes(x = Installs, y = avg.score))

ggplot(t11) + geom_boxplot(aes(x = reorder(as.factor(Installs), -avg.score), y = avg.score)) + labs(x = "Installments", y = "Average Score") + coord_flip()

# recover app name after data imputation
# add num_row to gg.new
mydata2 = mydata2 %>% mutate(r = row_number()) 
# split data into training and test data
# change the list to data frame 
gg.df = gg.new.imp[[1]] %>% unlist()
gg.data = data.frame(gg.df) %>% mutate(r = row_number()) 
t1 = left_join(gg.data, mydata2, by = "r") %>% 
  select(Rating.x, Reviews.y, Size.x, Installs.cat.y, Price.y, Content.Rating.y, Cat.cat.y, Interval.y) %>% print
# split data
(total_row = nrow(t1))
[1] 10356
ins.l= which(t1$Installs.cat.y == "low")
ins.m= which(t1$Installs.cat.y == "medium")
ins.h= which(t1$Installs.cat.y == "high")
train.id = c(sample(ins.l, size = trunc(0.8 *length(ins.l))),
             sample(ins.m, size = trunc(0.8 *length(ins.m))), 
             sample(ins.h, size = trunc(0.8 *length(ins.h))))
train.gg = t1[train.id, ]
test.gg = t1[-train.id, ]
levels(train.gg$`Installs`)
[1] "low"    "high"   "medium"
table(train.gg$`Installs`)

   low   high medium 
  2519   3243   2522 
# random forest
set.seed(415)
library(randomForest)
table(factor(train.gg$Installs.cat.y))

   low   high medium 
  2519   3243   2522 
bag.gg=randomForest(Installs.cat.y~., data=train.gg, mtry = ncol(train.gg) - 1,importance=TRUE)
bag.gg

Call:
 randomForest(formula = Installs.cat.y ~ ., data = train.gg, mtry = ncol(train.gg) -      1, importance = TRUE) 
               Type of random forest: classification
                     Number of trees: 500
No. of variables tried at each split: 7

        OOB estimate of  error rate: 34.8%
Confusion matrix:
        low high medium class.error
low    1646  281    592   0.3465661
high    130 2490    623   0.2321924
medium  393  864   1265   0.4984140
# plot
yhat.bag = predict(bag.gg, newdata=test.gg) 
# test error
(forest.test.err = mean(yhat.bag != test.gg$Installs.cat.y))
[1] 0.3412162
# get the importance
importance(bag.gg)
                       low      high   medium MeanDecreaseAccuracy MeanDecreaseGini
Rating.x          80.84635 139.63170 38.15214            155.73626         907.3467
Reviews.y        158.91254 122.63245 52.73638            181.03664        1643.7866
Size.x            34.04464 141.46910 27.05773            133.08066        1085.8908
Price.y           57.56282 135.45496 28.00605            121.66703         169.3639
Content.Rating.y  17.80934  12.88191 11.94957             22.99240         130.3032
Cat.cat.y         16.41888  96.11477 17.44344             79.89653         354.5948
Interval.y        38.12610 147.01826 24.95748            139.40041        1189.3941
varImpPlot(bag.gg)

# tree
set.seed(415)
library(tree)
#train.gg
#colnames(train.gg)[1] = "Rating"
#colnames(train.gg)[2] = "Reviews"
#colnames(train.gg)[3] = "Size"
#colnames(train.gg)[5] = "Price"
#colnames(train.gg)[6] = "Content Rating"
#colnames(train.gg)[7] = "Category"
#colnames(train.gg)[1] = "Time Since Last Update"
#train.gg
train.gg
tree.gg = tree(Installs.cat.y~., data = train.gg)
NAs introduced by coercion
summary(tree.gg)

Classification tree:
tree(formula = Installs.cat.y ~ ., data = train.gg)
Variables actually used in tree construction:
[1] "Reviews.y" "Size.x"    "Rating.x"  "Price.y"  
Number of terminal nodes:  8 
Residual mean deviance:  1.682 = 13920 / 8276 
Misclassification error rate: 0.4045 = 3351 / 8284 
plot(tree.gg)
text(tree.gg, pretty = 1, cex = 1)

# predict class labels; without type = "class", predict.tree returns a matrix of
# class probabilities, which is why the naive comparison below reported an error rate of 1
yhat.tree = predict(tree.gg, newdata=test.gg, type = "class") 
NAs introduced by coercion
# test error
(tree.test.err = mean(yhat.tree != test.gg$Installs.cat.y))
# prune the tree
cv.gg.tree=cv.tree(tree.gg,FUN=prune.misclass)
NAs introduced by coercion (warning repeated across the cross-validation folds)
cv.gg.tree
$size
[1] 8 7 6 5 4 3 2 1

$dev
[1] 3410 3480 3567 3592 3690 3795 4300 5041

$k
[1] -Inf   61   77   84  105  128  494  741

$method
[1] "misclass"

attr(,"class")
[1] "prune"         "tree.sequence"
# par(mfrow=c(1,2))
# plot(cv.gg.tree$size,cv.gg.tree$dev / length(train.gg),ylab="cv error", xlab="size",type="b")
# plot(cv.gg.tree$k, cv.gg.tree$dev / length(train.gg),ylab="cv error", xlab="k",type="b")
# predict using pruning tree
prune.tree=prune.misclass(tree.gg,best=8)
tree.pred=predict(prune.tree, test.gg,type="class")
NAs introduced by coercion
table(tree.pred, test.gg$Installs.cat.y)
         
tree.pred low high medium
   low    257    0     27
   high    71  554    161
   medium 302  257    443
(test.tree.err = mean(tree.pred != test.gg$Installs.cat.y)) 
[1] 0.3947876
# plot the tree
plot(prune.tree)
text(prune.tree, pretty = 0, cex = 1)

As we can see in both the single tree and the random forest, Reviews is the most important predictor. When we dig into the reviews, we find that approximately 1,000 apps have more than 100 relevant text reviews / comments.
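
As a quick sanity check of that figure, one could count the non-empty translated reviews per app in the raw review file (a sketch assuming `user_review` as loaded above, not the exact code used for the original count):

```r
# apps with more than 100 usable text reviews
user_review %>%
  filter(!is.na(Translated_Review), Translated_Review != "nan") %>%
  count(App, name = "n_reviews") %>%
  filter(n_reviews > 100) %>%
  nrow()
```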

SVM on training set

set.seed(415)
# get data frame ready to use
train.gg
table(factor(train.gg$Installs.cat.y))

   low   high medium 
  2519   3243   2522 
costVals = c(1, 5, 10, 50)
# linear kernel
# running too slow, be careful to change predictors
svm1 <- tune(svm, as.factor(Installs.cat.y) ~ ., data = train.gg,
             kernel = "linear",
             ranges = list("cost" = costVals)) 
summary(svm1)

Parameter tuning of ‘svm’:

- sampling method: 10-fold cross validation 

- best parameters:

- best performance: 0.4531633 

- Detailed performance results:
# find the best cost under linear kernel
best_mod_linear = svm1$best.model
summary(best_mod_linear)

Call:
best.tune(method = svm, train.x = as.factor(Installs.cat.y) ~ ., data = train.gg, 
    ranges = list(cost = costVals), kernel = "linear")


Parameters:
   SVM-Type:  C-classification 
 SVM-Kernel:  linear 
       cost:  50 

Number of Support Vectors:  6926

 ( 2236 2441 2249 )


Number of Classes:  3 

Levels: 
 low high medium
# thus the cost of the best model is 50.
# get the test error of the best model of the linear kernel
test.gg %>% str()
'data.frame':   2072 obs. of  8 variables:
 $ Rating.x        : num  4.1 4.5 4.2 4.6 4.5 4.18 4.7 4.2 3.8 4 ...
 $ Reviews.y       : num  1183 1947 3914 3830 2561 ...
 $ Size.x          : num  19 25 20 21 17 7 25 11 9.2 9.4 ...
 $ Installs.cat.y  : Factor w/ 3 levels "low","high","medium": 3 2 2 3 3 3 3 3 3 3 ...
 $ Price.y         : num  0 0 0 0 0 0 0 0 0 0 ...
 $ Content.Rating.y: Factor w/ 7 levels "","Adults only 18+",..: 3 6 6 3 3 3 3 3 3 3 ...
 $ Cat.cat.y       : Factor w/ 7 levels "Family","Education",..: 2 2 2 2 2 2 2 2 2 2 ...
 $ Interval.y      : 'difftime' num  1095 943 1010 925 ...
  ..- attr(*, "units")= chr "days"
pred_test_linear = predict(best_mod_linear, newdata = test.gg)
table(predict = pred_test_linear, truth = test.gg$Installs.cat.y)
        truth
predict  low high medium
  low    348  102    215
  high   156  640    229
  medium 126   69    187
(test_err_linear = mean(pred_test_linear != test.gg$Installs.cat.y))
[1] 0.4329151
set.seed(415)
# kernel radial
gammaVals = c(1, 2, 3, 4)
# note: gamma is passed straight through to svm() here rather than tuned; to tune
# over gammaVals, supply ranges = list(gamma = gammaVals) as in the linear-kernel call
svm_radial <- tune(svm, as.factor(Installs.cat.y) ~ ., data = train.gg, 
                   kernel = "radial",
                   cost = 100,
                   gamma = gammaVals)
summary(svm_radial)

Error estimation of ‘svm’ using 10-fold cross validation: 0.4438649
best_mod_radial = svm_radial$best.model
summary(best_mod_radial)

Call:
best.tune(method = svm, train.x = as.factor(Installs.cat.y) ~ ., data = train.gg, 
    kernel = "radial", cost = 100, gamma = gammaVals)


Parameters:
   SVM-Type:  C-classification 
 SVM-Kernel:  radial 
       cost:  100 

Number of Support Vectors:  5928

 ( 1721 2203 2004 )


Number of Classes:  3 

Levels: 
 low high medium
# get test error of kernel of the radial
pred_test_radial = predict(best_mod_radial, newdata = test.gg)
(test_err_radial = mean(pred_test_radial != test.gg$Installs.cat.y))
[1] 0.4333977

Is it true that people tend to leave a text review when they feel highly positive about the app?

# left join the user_score table and t3
mydata2 = mydata2 %>% mutate(r = row_number()) %>% print 
gg.df = gg.new.imp[[1]] %>% unlist()
gg.data = data.frame(gg.df) %>% mutate(r = row_number()) %>% print
t3 = left_join(gg.data, mydata2, by = "r") %>% 
  select(Rating.x, Reviews.y, App.y, Installs.cat.y) %>% print
colnames(t3)[3] = "App"
t2 = inner_join(user_score, t3, by = "App") %>% print
# rating and avg score
# add main title manually: "Rating vs Average sentimental Score"
ggplot(data = t2, aes(x = Rating.x, y = avg.score)) + geom_bar(stat = "identity") + labs(x = "Rating", y = "Average Sentimental Score", title = "Rating vs Average sentimental Score") 

ggplot(data = t2, aes(x = as.factor(Installs.cat.y), y = avg.score)) + geom_boxplot() + labs(x = "Installment Category", y = "Average Sentimental Score")

#boxplot(t2$Installs.cat.y ~ t2$avg.score)
# rating vs reviews
ggplot(data = t2, aes(x = Reviews.y, y = avg.score)) + geom_bar(stat = "identity") + labs(x = "Number of #Reviews", y = "Average Sentimental Score", title = "Number of Reviews vs Average sentimental Score") 

High average sentiment scores tend to be concentrated at ratings of 4.0 and above.
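
A rough check of that observation, using the `t2` table built above (the 4.0 cut-off is our own choice for illustration):

```r
# compare average sentiment score for apps rated below 4.0 vs 4.0 and above
t2 %>%
  mutate(rating_band = if_else(Rating.x >= 4, ">= 4.0", "< 4.0")) %>%
  group_by(rating_band) %>%
  summarise(n_apps = n(), mean_avg_score = mean(avg.score, na.rm = TRUE))
```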

data frame that might not be used

final1 = left_join(gg.data, mydata2, by = "r") %>% select(App.y, Reviews.y, Rating.x, Interval.y, Size.x, Price.y, Cat.cat.y, Content.Rating.y) %>% print
colnames(final1)[1] = "App"
colnames(final1)[2] = "Reviews"
colnames(final1)[3] = "Rating"
colnames(final1)[4] = "Interval"
colnames(final1)[5] = "Size"
colnames(final1)[6] = "Price"
colnames(final1)[7] = "Category"
colnames(final1)[8] = "Content"
show((final1))
plot(final1)

This problem was very difficult: nothing we did seemed to raise the score by much. This is likely because people don't really know what they want, and the product with the best reviews often does not actually do that well. The sentiment plots suggest there is no major correlation between good reviews and installs. The project might have been more successful with an early switch to a regression formulation, since the grouped install data could be turned back into a number. In the end, the best we could do with all of the classes was 56% with the neural network, and the best tree reached 42%; with the reduced classes the best we could achieve was 62% with the ANN and 52% with the tree. Even 62% seems unsuccessful, as it is not far from majority-class guessing, yet it would likely be difficult to do much better: none of the attributes predicts installs well on its own. Still, calling the project a complete failure would not be correct. We found ways to improve accuracy through column selection, and although reducing the number of classes made the problem easier, it also made the results more consistent; an easier problem with higher success would be a safer bet for those looking to invest.
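
The neural network referenced above does not appear in this notebook. As a heavily hedged sketch of what a comparable baseline could look like (our own setup with the `nnet` package on the `train.gg`/`test.gg` split, not the model that produced the figures quoted here):

```r
library(nnet)
set.seed(415)
# scale the numeric predictors so the network trains sensibly
num.cols <- c("Rating.x", "Reviews.y", "Size.x", "Price.y")
train.sc <- scale(train.gg[num.cols])
test.sc  <- scale(test.gg[num.cols],
                  center = attr(train.sc, "scaled:center"),
                  scale  = attr(train.sc, "scaled:scale"))
train.ann <- data.frame(train.sc, Installs.cat.y = train.gg$Installs.cat.y)
test.ann  <- data.frame(test.sc,  Installs.cat.y = test.gg$Installs.cat.y)
# small single-hidden-layer network with a softmax output over the three classes
ann.gg   <- nnet(Installs.cat.y ~ ., data = train.ann, size = 10, maxit = 300, trace = FALSE)
ann.pred <- predict(ann.gg, newdata = test.ann, type = "class")
mean(ann.pred != test.ann$Installs.cat.y)   # hold-out misclassification rate
```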

---
title: "GooglePlayStore-Analysis"
author: "Khawla-BanyDomi"
output: html_notebook
---
# Importance of Project
Our project goal is to predict the number of installs of an app by looking at app information and its reviews. We hope that this project will be helpful to app developers who need to estimate their number of installs, or to investors who want to pick out the next big app. Companies may run beta focus groups, or app developers may receive feedback from testers and gather a certain number of reviews. We use this, along with some knowledge about the app, to predict its success. Knowing the number of installs can be very helpful to developers and business managers because it lets them estimate profit. The results of this project may also show how important reviews are in the app market, since they could be one of the determining factors for the number of installs.

What will be done?
Data science can be summarized into five steps: capture, maintain, process, analyze and communicate. We gather data with meaningful variables that lead to appropriate classes, then clean the data so that it is easy for a computer to read and to use for modeling. We apply algorithms to train a model, test it on the data acquired above, and analyze the model's performance. Finally, we review the results and attempt to extract any relevant insights.

# This Document consists of the following parts in the context of Data Mining:

 1. data collection and cleaning
 2. visualization
 3. missing value imputation
 4. feature engineering
 5. classification (SVM, decision tree, random forest) and model comparison
 6. conclusion
 7. limitations

We will select the model that most accurately predicts the number of installs of apps and identify which characteristics influence the installs of a given app.

---

# 1. Loading Data

```{r }
gg = read.csv("googleplaystore.csv")
review = read.csv("googleplaystore_user_reviews.csv")
library(e1071)
library(tidyverse)
review1 = review %>% select(App, Translated_Review)
head(review1)
knitr::kable(head(review1))
head(review)
head(gg)
```

# 2. Data Preprocessing
There are two datasets from Kaggle (https://www.kaggle.com/lava18/google-play-store-apps). One is the list of apps with information such as app name, category, rating and more. The other is a list of reviews for each app, with a sentiment label indicating whether each review was positive, neutral or negative. We could not use these two files directly because they are not joined.

First, we analysed which columns are irrelevant to the number of installs of an app; this was done by common sense. We removed size, last updated date, current version and Android version because they are not factors that would affect the number of installs before publishing. We also removed rating and number of reviews because they are obviously associated with installs and would not be known before publishing. Then we trimmed the data of any out-of-place characters.

We combined Category and Genre by grouping common keywords and added the key categories as columns, so that each app's category/genre attributes can be expressed as true or false. Moreover, we gathered the review sentiment for each app and counted the number of positive, neutral and negative reviews to compute their percentages. Finally, we joined those three columns to the existing app list.

This includes removing bad values, splitting binary values, cleaning text values, splitting categorical values, cleaning numerical values, merging rows and dropping columns.


```{r}
str(gg)
```

There are a lot of factor variables which should actually be converted to numeric variables.

## 2.1 Converting variable types (imputation)
```{r}
library(lubridate)
library(tidyverse)
library(dplyr)
gg.new <- gg %>%
  mutate(
    # Eliminate "+" to transform Installs to numeric variable
   # Installs = gsub("\\+", "", as.character(Installs)),
   # Installs = as.numeric(gsub(",", "", Installs)),
    # Eliminate "M" to transform Size to numeric variable
    Size = gsub("M", "", Size),
    # For cells with k, divide it by 1024, since 1024kB = 1MB, the unit for size is MB
    Size = ifelse(grepl("k", Size),as.numeric(gsub("k", "", Size))/1024, as.numeric(Size)),
    # Transform reviews to numeric
    Reviews = as.numeric(Reviews),
    # Remove "$" from Price to transform it to numeric
    Price = as.numeric(gsub("\\$", "", as.character(Price))),
    # Convert Last Updated to date format
    Last.Updated = mdy(Last.Updated),
    # Replace "Varies with device" to NA since it is unknown
    Min.Android.Ver = gsub("Varies with device", NA, Android.Ver),
    # Keep only version number to 1 decimal as it's most representative
    Min.Android.Ver = as.numeric(substr(Min.Android.Ver, start = 1, stop = 3)),
    # Drop old Android version column
    Android.Ver = NULL
  ) %>% 
  filter(
    # Two apps had type as 0 or NA, they will be removed 
    Type %in% c("Free", "Paid")
 )
```


```{r}
str(gg.new)
```
```{r}
options(scipen=999)
table(gg.new$Installs)
gg.new$Installs%>%str()%>% print
gg.new %>% filter(Installs == "500,000") %>% print
```

```{r}
library(highcharter)
gg.new %>% select(-Min.Android.Ver) %>% 
    summarise_all(
        ~ sum(is.na(.))
    ) %>%
  gather() %>%
  # Only show columns with NA
  filter(value> 1) %>%
  arrange(-value) %>%
    hchart('column', hcaes(x = 'key', y = 'value', color = 'key')) %>%
  hc_add_theme(hc_theme_elementary()) %>%
  hc_title(text = "Columns with Missing Value")
```


### Most popular category 
```{r}
gg.new1 <- gg %>%
  mutate(
    # Eliminate "+" to transform Installs to numeric variable
    Installs = gsub("\\+", "", as.character(Installs)),
    Installs = as.numeric(gsub(",", "", Installs)),
    # Eliminate "M" to transform Size to numeric variable
    Size = gsub("M", "", Size),
    # For cells with k, divide it by 1024, since 1024kB = 1MB, the unit for size is MB
    Size = ifelse(grepl("k", Size),as.numeric(gsub("k", "", Size))/1024, as.numeric(Size)),
    # Transform reviews to numeric
    Reviews = as.numeric(Reviews),
    # Remove "$" from Price to transform it to numeric
    Price = as.numeric(gsub("\\$", "", as.character(Price))),
    # Convert Last Updated to date format
    Last.Updated = mdy(Last.Updated),
    # Replace "Varies with device" to NA since it is unknown
    Min.Android.Ver = gsub("Varies with device", NA, Android.Ver),
    # Keep only version number to 1 decimal as it's most representative
    Min.Android.Ver = as.numeric(substr(Min.Android.Ver, start = 1, stop = 3)),
    # Drop old Android version column
    Android.Ver = NULL
  )
gg.new2 = gg.new1 %>% mutate(Interval = difftime(time1 = today(), time2 = Last.Updated)) %>% print
ggplot(gg.new2) + geom_line(aes(x = Interval, y = Installs)) + labs(x = "Days Since Last Update", y = "Installments")
```


```{r}
gg.new1 %>% 
  group_by(Category) %>% filter(Category != 1.9) %>% 
  summarize(
    TotalInstalls = sum(as.numeric(Installs))
  ) %>%
  arrange(-TotalInstalls) %>%
  hchart('scatter', hcaes(x = "Category", y = "TotalInstalls", size = "TotalInstalls", color = "Category")) %>%
  hc_add_theme(hc_theme_538()) %>%
  hc_title(text = "Most popular categories")
```

### Correlation map
```{r}
# demo correlation heatmap on two columns of the built-in iris data (not the app data)
head(iris)
library(reshape2)
df_cor = iris[,2:3]
cormat <- round(cor(df_cor),2) 
melted_cormat <- melt(cormat)
ggplot(data = melted_cormat, aes(Var2, Var1, fill = value))+
 geom_tile(color = "white")+
 scale_fill_gradient2(low = "yellow", high = "purple", mid = "red",
   midpoint = 0, limit = c(-1,1), space = "Lab",
   name="Pearson\nCorrelation") +
  theme_minimal()+
 theme(axis.text.x = element_text(angle = 45, vjust = 1,
    size = 12, hjust = 1))+
 coord_fixed()
```



## 2.2 Divide Installs into 3 categories
```{r}
library(tidyverse)
options(scipen=999)
# write function to convert installment
convert_install = function(data, installment) {
  #install.levels = factor(c("low", "medium", "high"))
  
  if (installment %in% c("0", "1", "50", "100", "500", "1,000", "5,000", "10,000", "50,000")) {
  Installs.cat = "low"
  }
  else if (installment %in% c ("100,000", "500,000", "1,000,000", "5,000,000")){
    Installs.cat = "medium"
  }
  else {
      Installs.cat = "high"
  }
}
#gg.new = gg.new %>% filter(!is.na(Installs)) %>% mutate(Installs.cat = factor(convert_install(gg.new, Installs), # levels = c("low", "medium", "high")))
sum((gg.new$Installs) %in% "10,000")
# gg.new = gg.new %>% mutate(Installs.cat = "1")
str(gg.new)
table(gg.new$Installs)
table(gg.new$Installs.cat)
gg.new = gg.new %>% filter(Installs != "Free") %>% mutate(
  Installs.cat = fct_collapse(Installs, 
                              low = c("Free","0", "0+","1+", "5+", "10+","100+", "50+", "100+", "500+", "1,000+", "5,000+"), 
                              medium = c("10,000+", "50,000+", "100,000+", "500,000+"), 
                              high = c("1,000,000+", "5,000,000+", "1,000,000,000+", "10,000,000+", "100,000,000+", "50,000,000+", "500,000,000+")))
table(gg.new$Installs.cat)
```

### Boxplot of Rating across the install categories
```{r}
ggplot(data = gg.new) +
  geom_boxplot(aes(x = reorder(Installs.cat, -Rating), y = Rating)) + 
  labs(x = "Installment Categories",y = "Rating")
```



## 2.3 Delete duplicated rows
```{r}
# number of observations before deleting duplicated rows
(original_num_rows = nrow(gg.new))
gg.new.uniq = gg.new %>% distinct
# number of rows after delete duplicated rows
(uniq_num_rows = nrow(gg.new.uniq))
# number of duplicated rows
(dup_rows = original_num_rows - uniq_num_rows)
```

## 2.4 Merge Category into 6 groups
```{r}
# gg.new.uniq %>% filter (!is.na(Category)) %>% print
levels(gg.new.uniq$Category)
```

```{r}
mydata1 = gg.new.uniq %>% filter(Category != 1.9) %>% mutate(Cat.cat = fct_collapse(Category,
                                                        Education = c("EDUCATION", "BOOKS_AND_REFERENCE", "LIBRARIES_AND_DEMO", "ART_AND_DESIGN"),
                                                        Personalization = c("PERSONALIZATION", "BEAUTY", "SHOPPING", "DATING", "PHOTOGRAPHY"),
                                                        Lifestyle = c("HEALTH_AND_FITNESS", "MEDICAL", "LIFESTYLE", "SPORTS", "FOOD_AND_DRINK"),
                                                        Family = c("FAMILY", "PARENTING", "HOUSE_AND_HOME", "1.9"),
                                                        Entertainment = c("ENTERTAINMENT", "GAME", "COMICS", "VIDEO_PLAYERS"), 
                                                        Business = c("BUSINESS", "FINANCE", "PRODUCTIVITY", "TOOLS", "NEWS_AND_MAGAZINES", "EVENTS", "SOCIAL", "COMMUNICATION"),
                                                        Travel = c("MAPS_AND_NAVIGATION", "AUTO_AND_VEHICLES", "TRAVEL_AND_LOCAL", "WEATHER")))
```

```{r}
mydata2 = mydata1 %>% mutate(Interval = difftime(time1 = today(), time2 = Last.Updated))
str(mydata2)
mydata2 %>% filter(Installs.cat == "low") %>% print
```

#### Impute missing values
```{r}
#missForest
library(missForest)
#impute missing values, using all parameters as default values
gg.new.imp <- missForest(data.matrix(mydata2), maxiter = 5, ntree = 10)
#check imputed values
# gg.new.imp$ximp
#check imputation error
gg.new.imp$OOBerror
```
Removing outliers
For the sentiment scores, some data points are outliers. We remove them by cutting the head of the positive sentiment score distribution at 0.2; similarly, we prune the neutral and negative sentiment score distributions by cutting their tails at 0.4 and 0.5, respectively.

#### Get the sentiment score
```{r}
# install.packages("stringr")
# install.packages("tidytext")
library(stringr)
library(tidytext)
```

```{r}
# read in user reviews
user_review = read.csv("googleplaystore_user_reviews.csv")
str(user_review)
user_review %>% print
head(user_review)
# get sentiment data frame
# the AFINN lexicon in current tidytext has columns `word` and `value` (not `score`)
sents = get_sentiments("afinn") %>% print
range(sents$value)
```

```{r}
# left join the sentiment chart and the user reviews to get score
t1 = user_review %>% mutate(review = as.character(Translated_Review)) %>% unnest_tokens(word, review)
# t2 = user_review[1:500, ]
# sum the per-word AFINN `value` within each app; the original `sum(t1$score, ...)`
# referenced a column that does not exist and silently returned 0 for every app
user_score = left_join(t1, sents) %>% group_by(App) %>% summarise(n = n(), score = sum(value, na.rm = TRUE)) %>% mutate(avg.score = score / n) %>% print
# range(user_score $ avg.score)
```


```{r}
user_review %>% group_by(App) %>% count
t11 = user_score %>% inner_join(gg.new) %>% filter(Installs != 5000) %>% filter(Installs != 1000000000)
ggplot(t11) + geom_line(aes(x = Installs, y = avg.score))
ggplot(t11) + geom_boxplot(aes(x = reorder(as.factor(Installs), -avg.score), y = avg.score)) + labs(x = "Installments", y = "Average Score") + coord_flip()
```
```{r}
# recover app name after data imputation
# add num_row to gg.new
mydata2 = mydata2 %>% mutate(r = row_number()) 
# split data into training and test data
# change the list to data frame 
gg.df = gg.new.imp[[1]] %>% unlist()
gg.data = data.frame(gg.df) %>% mutate(r = row_number()) 
t1 = left_join(gg.data, mydata2, by = "r") %>% 
  select(Rating.x, Reviews.y, Size.x, Installs.cat.y, Price.y, Content.Rating.y, Cat.cat.y, Interval.y) %>% print
# split data
(total_row = nrow(t1))
ins.l= which(t1$Installs.cat.y == "low")
ins.m= which(t1$Installs.cat.y == "medium")
ins.h= which(t1$Installs.cat.y == "high")
train.id = c(sample(ins.l, size = trunc(0.8 *length(ins.l))),
             sample(ins.m, size = trunc(0.8 *length(ins.m))), 
             sample(ins.h, size = trunc(0.8 *length(ins.h))))
train.gg = t1[train.id, ]
test.gg = t1[-train.id, ]
levels(train.gg$`Installs`)
table(train.gg$`Installs`)
```


```{r}
# random forest
set.seed(415)
library(randomForest)
table(factor(train.gg$Installs.cat.y))
bag.gg=randomForest(Installs.cat.y~., data=train.gg, mtry = ncol(train.gg) - 1,importance=TRUE)
bag.gg
# plot
yhat.bag = predict(bag.gg, newdata=test.gg) 
# test error
(forest.test.err = mean(yhat.bag != test.gg$Installs.cat.y))
# get the importance
importance(bag.gg)
varImpPlot(bag.gg)
```

```{r}
# tree
set.seed(415)
library(tree)
#train.gg
#colnames(train.gg)[1] = "Rating"
#colnames(train.gg)[2] = "Reviews"
#colnames(train.gg)[3] = "Size"
#colnames(train.gg)[5] = "Price"
#colnames(train.gg)[6] = "Content Rating"
#colnames(train.gg)[7] = "Category"
#colnames(train.gg)[1] = "Time Since Last Update"
#train.gg
train.gg
tree.gg = tree(Installs.cat.y~., data = train.gg)
summary(tree.gg)
plot(tree.gg)
text(tree.gg, pretty = 1, cex = 1)
# without type = "class", predict.tree returns class probabilities rather than labels
yhat.tree = predict(tree.gg, newdata=test.gg, type = "class") 
# test error
(tree.test.err = mean(yhat.tree != test.gg$Installs.cat.y))
```
 


```{r}
# prune the tree
cv.gg.tree=cv.tree(tree.gg,FUN=prune.misclass)
cv.gg.tree
# par(mfrow=c(1,2))
# plot(cv.gg.tree$size,cv.gg.tree$dev / length(train.gg),ylab="cv error", xlab="size",type="b")
# plot(cv.gg.tree$k, cv.gg.tree$dev / length(train.gg),ylab="cv error", xlab="k",type="b")
# predict using pruning tree
prune.tree=prune.misclass(tree.gg,best=8)
tree.pred=predict(prune.tree, test.gg,type="class")
table(tree.pred, test.gg$Installs.cat.y)
(test.tree.err = mean(tree.pred != test.gg$Installs.cat.y)) 
# plot the tree
plot(prune.tree)
text(prune.tree, pretty = 0, cex = 1)
```

As we can see in both the single tree and the random forest, Reviews is the most important predictor. When we dig into the reviews, we find that approximately 1,000 apps have more than 100 relevant text reviews / comments.

#### SVM on training set
```{r}
set.seed(415)
# get data frame ready to use
train.gg
table(factor(train.gg$Installs.cat.y))
costVals = c(1, 5, 10, 50)
# linear kernel
# running too slow, be careful to change predictors
svm1 <- tune(svm, as.factor(Installs.cat.y) ~ ., data = train.gg,
             kernel = "linear",
             ranges = list("cost" = costVals)) 
summary(svm1)
# find the best cost under linear kernel
best_mod_linear = svm1$best.model
summary(best_mod_linear)
# thus the cost of the best model is 50.
```

```{r}
# get the test error of the best model of the linear kernel
test.gg %>% str()
pred_test_linear = predict(best_mod_linear, newdata = test.gg)
table(predict = pred_test_linear, truth = test.gg$Installs.cat.y)
(test_err_linear = mean(pred_test_linear != test.gg$Installs.cat.y))
```

```{r}
set.seed(415)
# kernel radial
gammaVals = c(1, 2, 3, 4)
# note: gamma is passed straight through to svm() here rather than tuned; to tune
# over gammaVals, supply ranges = list(gamma = gammaVals) as in the linear-kernel call
svm_radial <- tune(svm, as.factor(Installs.cat.y) ~ ., data = train.gg, 
                   kernel = "radial",
                   cost = 100,
                   gamma = gammaVals)
summary(svm_radial)
```

```{r}
best_mod_radial = svm_radial$best.model
summary(best_mod_radial)
```

```{r}
# get test error of kernel of the radial
pred_test_radial = predict(best_mod_radial, newdata = test.gg)
(test_err_radial = mean(pred_test_radial != test.gg$Installs.cat.y))
```







Is it true that people tend to leave a text review when they feel highly positive about the app?
```{r}
# left join the user_score table and t3
mydata2 = mydata2 %>% mutate(r = row_number()) %>% print 
gg.df = gg.new.imp[[1]] %>% unlist()
gg.data = data.frame(gg.df) %>% mutate(r = row_number()) %>% print
t3 = left_join(gg.data, mydata2, by = "r") %>% 
  select(Rating.x, Reviews.y, App.y, Installs.cat.y) %>% print
colnames(t3)[3] = "App"
t2 = inner_join(user_score, t3, by = "App") %>% print
# rating and avg score
# add main title manually: "Rating vs Average sentimental Score"
ggplot(data = t2, aes(x = Rating.x, y = avg.score)) + geom_bar(stat = "identity") + labs(x = "Rating", y = "Average Sentimental Score", title = "Rating vs Average sentimental Score") 
ggplot(data = t2, aes(x = as.factor(Installs.cat.y), y = avg.score)) + geom_boxplot() + labs(x = "Installment Category", y = "Average Sentimental Score")
#boxplot(t2$Installs.cat.y ~ t2$avg.score)
# rating vs reviews
ggplot(data = t2, aes(x = Reviews.y, y = avg.score)) + geom_bar(stat = "identity") + labs(x = "Number of #Reviews", y = "Average Sentimental Score", title = "Number of Reviews vs Average sentimental Score") 
```

High average sentiment scores tend to be concentrated at ratings of 4.0 and above.










#### data frame that might not be used
```{r}
final1 = left_join(gg.data, mydata2, by = "r") %>% select(App.y, Reviews.y, Rating.x, Interval.y, Size.x, Price.y, Cat.cat.y, Content.Rating.y) %>% print
colnames(final1)[1] = "App"
colnames(final1)[2] = "Reviews"
colnames(final1)[3] = "Rating"
colnames(final1)[4] = "Interval"
colnames(final1)[5] = "Size"
colnames(final1)[6] = "Price"
colnames(final1)[7] = "Category"
colnames(final1)[8] = "Content"
show((final1))
plot(final1)
```
This problem was very difficult: nothing we did seemed to raise the score by much. This is likely because people don't really know what they want, and the product with the best reviews often does not actually do that well. The sentiment plots suggest there is no major correlation between good reviews and installs. The project might have been more successful with an early switch to a regression formulation, since the grouped install data could be turned back into a number. In the end, the best we could do with all of the classes was 56% with the neural network, and the best tree reached 42%; with the reduced classes the best we could achieve was 62% with the ANN and 52% with the tree. Even 62% seems unsuccessful, as it is not far from majority-class guessing, yet it would likely be difficult to do much better: none of the attributes predicts installs well on its own. Still, calling the project a complete failure would not be correct. We found ways to improve accuracy through column selection, and although reducing the number of classes made the problem easier, it also made the results more consistent; an easier problem with higher success would be a safer bet for those looking to invest.
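
As a final note, here is a minimal sketch of the regression reformulation suggested above (our own illustration, not part of the original analysis), reusing the numeric `Installs` column created for `gg.new1`:

```r
library(randomForest)
set.seed(415)
# model log10 installs directly instead of the low/medium/high classes
reg.data <- gg.new1 %>%
  select(Installs, Rating, Reviews, Size, Price) %>%
  drop_na()
rf.reg <- randomForest(log10(Installs + 1) ~ ., data = reg.data, ntree = 200)
rf.reg   # the "% Var explained" line gives a rough sense of the regression fit
```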