Setup
Loading the packages required for the assignment
library(readr)
library(tidyr)
library(dplyr)
library(Hmisc)
library(outliers)
library(kableExtra)
library(knitr)
Read WHO Data
WHO data is read from WHO.csv using the read_csv() function of the readr package and stored as a data frame named who_data.
who_data <- read_csv("WHO.csv")
dim(who_data)
[1] 7240 60
Tidy Task 1:
Tidying data using gather():
The WHO data set is in wide format rather than tidy format: columns 5 to 60 each hold the values for one code. Here we use the tidyr function gather() to reshape these 56 columns into long format, with one row per code.
who_data <- who_data %>%
gather(c(5:60), key = "code", value = "value" )
The data set after reshape:
who_data
Dimensions of the dataset:
dim(who_data)
[1] 405440 6
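As a quick check on the dimensions above: gathering 56 columns turns each of the 7240 original rows into 56 rows (7240 x 56 = 405440), and the 56 gathered columns are replaced by the two columns code and value (60 - 56 + 2 = 6). The following is a minimal sketch of the same reshape on a small made-up data frame; toy_wide and its values are hypothetical and are not taken from WHO.csv.

```r
library(tidyr)
library(dplyr)

# Hypothetical toy data: two identifier columns and three wide "code" columns
toy_wide <- tibble(
  country     = c("A", "B"),
  year        = c(2000, 2000),
  new_sp_m014 = c(1, 2),
  new_sp_f014 = c(3, 4),
  new_ep_m014 = c(5, NA)
)

# Gather columns 3 to 5 into key/value pairs, mirroring gather(c(5:60), ...) above
toy_long <- toy_wide %>%
  gather(3:5, key = "code", value = "value")

# 2 rows x 3 gathered columns = 6 rows; 2 identifier columns + code + value = 4 columns
dim(toy_long)
```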
Tidy Task 2:
Separating the code column:
The code column encodes four different variables in a single string. In this task we separate the code column into four columns: new, var, sex, and age.
after_tidy_2 <- who_data %>%
# separating the columns based on the separator '_'
separate(code, into = c("new", "var", "sex_age"), sep = c("_")) %>%
# sex and age have no separator between them, so we split after the first character (m/f)
separate(sex_age, into = c("sex", "age"), sep = 1)
The dataset after separating the code column:
after_tidy_2
Dimensions of the dataset:
dim(after_tidy_2)
[1] 405440 9
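To make the two-step split concrete, here is a small sketch on two made-up code strings that follow the same pattern as the codes in this data set. The data frame toy_codes is hypothetical. A character sep splits on that separator, while a numeric sep splits at a character position.

```r
library(tidyr)
library(dplyr)

# Hypothetical codes following the pattern new_<var>_<sex><age>
toy_codes <- tibble(code = c("new_sp_m014", "new_rel_f65"))

toy_split <- toy_codes %>%
  # first split on the underscore separator
  separate(code, into = c("new", "var", "sex_age"), sep = "_") %>%
  # then split after the first character, which holds the sex (m/f)
  separate(sex_age, into = c("sex", "age"), sep = 1)

toy_split
```

One column (code) is replaced by four, which is why the column count above goes from 6 to 9.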
Tidy Task 3:
Tidying var column using spread():
The var column contains four keys - rel, ep, sn, and sp. Each of these keys can be a separate variable (column). We use spread() to place these keys into their own variables with the values from the value variable.
after_tidy_3 <- after_tidy_2 %>% spread(key = var, value = value)
The dataset after spreading the var column:
after_tidy_3
Structure of the dataset:
str(after_tidy_3)
Classes ‘tbl_df’, ‘tbl’ and 'data.frame': 101360 obs. of 11 variables:
$ country: chr "Afghanistan" "Afghanistan" "Afghanistan" "Afghanistan" ...
$ iso2 : chr "AF" "AF" "AF" "AF" ...
$ iso3 : chr "AFG" "AFG" "AFG" "AFG" ...
$ year : num 1980 1981 1982 1983 1984 ...
$ new : chr "new" "new" "new" "new" ...
$ sex : chr "m" "m" "m" "m" ...
$ age : chr "014" "014" "014" "014" ...
$ ep : num NA NA NA NA NA NA NA NA NA NA ...
$ rel : num NA NA NA NA NA NA NA NA NA NA ...
$ sn : num NA NA NA NA NA NA NA NA NA NA ...
$ sp : num NA NA NA NA NA NA NA NA NA NA ...
Dimensions of the dataset:
dim(after_tidy_3)
[1] 101360 11
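The row count drops by a factor of four (405440 / 4 = 101360) because the four rows that differ only in var collapse into one row with four value columns. A minimal sketch of this on made-up data follows; toy_keys and its values are hypothetical.

```r
library(tidyr)
library(dplyr)

# Hypothetical long data: two keys (sp, ep) per country
toy_keys <- tibble(
  country = c("A", "A", "B", "B"),
  var     = c("sp", "ep", "sp", "ep"),
  value   = c(1, 5, 2, NA)
)

# Each distinct key in var becomes its own column, filled from value
toy_spread <- toy_keys %>%
  spread(key = var, value = value)

toy_spread
```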
Tidy Task 4:
Factoring sex and age variables using mutate() and factor():
The sex and age variables are categorical and need to be factorised. We use mutate() to factorise the two variables. Here we create appropriate labels for each level in the age variable.
after_tidy_4 <- after_tidy_3 %>%
mutate(sex = factor(sex, levels = c("m","f")),
age = factor(age,
levels = c("014", "1524", "2534", "3544", "4554", "5564", "65"),
labels = c("<15", "15-24", "25-34", "35-44", "45-54", "55-64", "65>="),
ordered = TRUE
)
)
Levels of the sex and age variables after factoring:
levels(after_tidy_4$sex)
[1] "m" "f"
levels(after_tidy_4$age)
[1] "<15" "15-24" "25-34" "35-44" "45-54" "55-64" "65>="
Structure of the sex and age variables after factoring:
str(after_tidy_4$sex)
Factor w/ 2 levels "m","f": 1 1 1 1 1 1 1 1 1 1 ...
str(after_tidy_4$age)
Ord.factor w/ 7 levels "<15"<"15-24"<..: 1 1 1 1 1 1 1 1 1 1 ...
The dataset after factoring the sex and age variables:
after_tidy_4
Dimensions of the dataset:
dim(after_tidy_4)
[1] 101360 11
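Making age an ordered factor means the age bands can be compared and ranked, which a plain character column would not support sensibly. The sketch below applies the same levels and labels to three made-up codes; age_codes is hypothetical.

```r
# Hypothetical age codes, using the same levels and labels as above
age_codes <- c("014", "65", "2534")

age_f <- factor(age_codes,
                levels  = c("014", "1524", "2534", "3544", "4554", "5564", "65"),
                labels  = c("<15", "15-24", "25-34", "35-44", "45-54", "55-64", "65>="),
                ordered = TRUE)

# Ordered factors support comparisons against a level
younger_than_25_34 <- age_f < "25-34"

# ... and summaries such as max()
oldest_band <- as.character(max(age_f))

younger_than_25_34
oldest_band
```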
Task 5: Filter & Select
Dropping the iso2 and new columns and filtering three countries:
The iso2 and new columns are redundant and are therefore dropped. The dataset is then filtered to keep the data for three countries: India, Australia, and United Arab Emirates. The filtered dataset is saved as WHO_subset.
WHO_subset <- after_tidy_4 %>%
select(-c("iso2", "new")) %>%
filter(country == "India" | country == "Australia" | country == "United Arab Emirates" )
The dataset after dropping and filtering:
WHO_subset
The following code shows that WHO_subset has the filtered data for only 3 countries:
unique(WHO_subset$country)
[1] "Australia" "India" "United Arab Emirates"
Dimensions of the dataset:
dim(WHO_subset)
[1] 1428 9
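As an aside, the chain of == comparisons joined by | can also be written with %in%, which may be easier to extend when more countries are needed. This is only an alternative sketch on made-up data; toy_who and keep are hypothetical names, and the result is intended to match the approach used above.

```r
library(dplyr)

# Hypothetical miniature of the tidied WHO data
toy_who <- tibble(
  country = c("India", "Australia", "Brazil", "United Arab Emirates"),
  iso2    = c("IN", "AU", "BR", "AE"),
  new     = "new",
  sp      = c(1, 2, 3, 4)
)

keep <- c("India", "Australia", "United Arab Emirates")

toy_subset <- toy_who %>%
  select(-iso2, -new) %>%        # drop the redundant columns
  filter(country %in% keep)      # keep only the three countries

toy_subset
```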
Read Species and Surveys data sets
The species and surveys data sets are read from species.csv and surveys.csv using the read_csv() function of the readr package and stored as data frames named species and surveys respectively.
species <- read_csv("species.csv")
surveys <- read_csv("surveys.csv")
Task 6: Join
Combining surveys and species data:
The surveys and species datasets are joined on the key variable species_id. We use the left_join() function to add the species variables (genus, taxa, species) to the surveys data and save the result as a new data frame, surveys_combined.
surveys_combined <- left_join(surveys, species, by = "species_id")
surveys_combined
To compare, here are the dimensions of the datasets:
dim(species)
[1] 54 4
dim(surveys)
[1] 35549 8
dim(surveys_combined)
[1] 35549 11
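The dimensions are consistent with a left join: every row of surveys is kept (35549), and the three non-key columns of species are added (8 + 3 = 11). A minimal sketch on made-up tables follows; surveys_toy and species_toy are hypothetical, and the unmatched key "ZZ" is included only to show what happens when a survey row has no match.

```r
library(dplyr)

# Hypothetical miniature tables sharing the key species_id
surveys_toy <- tibble(
  record_id  = 1:3,
  species_id = c("NL", "DM", "ZZ")
)
species_toy <- tibble(
  species_id = c("NL", "DM"),
  genus      = c("Neotoma", "Dipodomys"),
  species    = c("albigula", "merriami")
)

joined_toy <- left_join(surveys_toy, species_toy, by = "species_id")

# All 3 survey rows are kept; the unmatched row gets NA in the added columns
joined_toy
```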
Task 8: Missing Values
Filtering the surveys_combined data:
We filter the surveys_combined data for the year 1997 and save it as the data set surveys_combined_year.
surveys_combined_year <- surveys_combined %>%
filter(year == 1997)
Counting the NA values in weight variable for each species:
The surveys_combined_year data is grouped by species and the number of missing values (NA) is calculated.
surveys_combined_year %>%
group_by(species) %>%
summarise(`NA Count` = sum(is.na(weight))) %>%
filter(`NA Count` != 0)
The output above lists the 13 species that have missing (NA) weight values, along with the NA count for each. The 3 species not displayed here are spilosoma (12), torridus (2), and NA (1).
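The counting pattern relies on is.na() returning a logical vector, so that sum() counts the TRUE values within each group. A small sketch on made-up data; toy_weights and the column name na_count are hypothetical.

```r
library(dplyr)

# Hypothetical weights: species "c" has no missing values
toy_weights <- tibble(
  species = c("a", "a", "b", "b", "c"),
  weight  = c(10, NA, NA, NA, 7)
)

na_counts <- toy_weights %>%
  group_by(species) %>%
  summarise(na_count = sum(is.na(weight))) %>%  # TRUE counts as 1
  filter(na_count != 0)                         # keep only species with missing values

na_counts
```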
Imputing the NA (missing) values:
Now we impute the NA values with the mean weight for each species using mutate(). We save the imputed data as surveys_weight_imputed.
# group the data by species and impute the NA values with the mean of each species
surveys_weight_imputed <- surveys_combined_year %>%
group_by(species) %>%
mutate(weight = ifelse(is.na(weight), mean(weight, na.rm=TRUE), weight))
The code above was written with reference to code from Stack Overflow: Link to code (https://stackoverflow.com/questions/55345593/impute-missing-data-with-mean-by-group)
Checking if the imputation was successful in the surveys_weight_imputed data:
surveys_weight_imputed %>%
group_by(species) %>%
summarise(`NA Count` = sum(is.na(weight))) %>%
filter(`NA Count` != 0)
From the output above, we can observe that surveys_weight_imputed still has a few species with missing values, although some missing values have been imputed successfully. Of the 13 species with missing values, the imputation filled in the missing values for 8 species. The remaining 5 species still show missing values. We inspect and explain why this happens in the next task.
Task 9: Special Values
Checking for special values in weight column in surveys_weight_imputed:
sum(sapply(surveys_weight_imputed$weight, function(x) (is.infinite(x) | is.nan(x) )) )
[1] 54
Here is the number of special values in the surveys_combined_year data before imputation:
sum(sapply(surveys_combined_year$weight, function(x) (is.infinite(x) | is.nan(x) )) )
[1] 0
From the outputs above, we can conclude that the special values were generated by the imputation: there were none before it and 54 after it. Note also that is.na() returns TRUE for NaN values as well as NA values, which is why the check in Task 8 still reported missing values after imputation. In reality, that check was counting NaN values, since all the NA values had been imputed. We inspect the reason for the NaN values below.
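The difference between the three checks can be seen on a short made-up vector (x is hypothetical):

```r
# Hypothetical vector with one ordinary value, one NA, one NaN and one Inf
x <- c(1, NA, NaN, Inf)

is.na(x)        # TRUE for both NA and NaN
is.nan(x)       # TRUE only for NaN
is.infinite(x)  # TRUE only for Inf / -Inf

# Count "true" NA values, excluding NaN
true_na_count <- sum(is.na(x) & !is.nan(x))
true_na_count
```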
Examining the data for a species that showed missing values after imputation:
filter(surveys_weight_imputed, species == "harrisi")
All the special values in the weight column for the species harrisi are NaN values. We get NaN values because for some species, such as harrisi, the weight column contains only NA values. When we calculate the mean for these species in order to impute the NA values (as performed in Task 8), we use the parameter na.rm = TRUE to ignore NA values. For a species such as harrisi, this means every value is ignored, resulting in NaN being returned as the mean. This caused NaN values to be imputed in the weight column.
Consider the following example to understand how the NaN values were created:
mean(c(NA, NA), na.rm = TRUE)
[1] NaN
When the mean is calculated for a vector containing only NAs, na.rm = TRUE removes every element and leaves an empty vector. The mean is then a sum of zero divided by a count of zero, and 0/0 evaluates to NaN.
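One possible way to avoid introducing NaN, shown here only as a sketch and not as part of the analysis above, is to leave a group untouched when it has no observed weights at all. The data frame toy_species is hypothetical: species "a" has observed weights, while species "b" has none.

```r
library(dplyr)

# Hypothetical data: species "b" has only missing weights
toy_species <- tibble(
  species = c("a", "a", "a", "b", "b"),
  weight  = c(10, NA, 20, NA, NA)
)

imputed_toy <- toy_species %>%
  group_by(species) %>%
  mutate(weight = if (all(is.na(weight))) {
    weight  # nothing to average, so keep the values as NA
  } else {
    ifelse(is.na(weight), mean(weight, na.rm = TRUE), weight)
  }) %>%
  ungroup()

# Species "a" is imputed with its mean (15); species "b" stays NA, not NaN
imputed_toy
```

With this guard the remaining gaps stay as ordinary NA values, so they are still reported by is.na() but no special values are created.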
Task 10: Outliers