Required packages
library(dplyr)
library(readr)
library(tidyr)
library(Hmisc)
library(outliers)
library(forecast)
Executive Summary
This report demonstrates the data preprocessing skills developed throughout the semester. Open data sets are cleaned and transformed into a tidy format that can be used for analysis. The following steps are taken to obtain the desired data format.
- The CSV data sets are imported into R using read.csv().
- Because the data sets are large, they are subsetted and then merged using left_join() from the dplyr package and merge() from base R.
- The structure of the merged data set is inspected to understand the attributes and their data types. Two variables are converted to factors that are ordered and labelled appropriately.
- The resulting data frame is checked to confirm whether it is in tidy format.
- A new column is added to the final data set using mutate().
- The data frame is scanned for missing values; mean imputation is applied with impute() and rows that still contain NA values are removed.
- Numeric variables are examined for outliers. The outliers found are removed using Tukey's method or the z-score method.
- A square transformation followed by a log transformation is applied to one of the columns to produce data better suited to statistical analysis.
Data
The data come from the National Health and Nutrition Examination Survey (NHANES), a program that assesses the diet and medications of adults and children in the United States.
The data sets were downloaded from the following link: https://www.kaggle.com/cdc/national-health-and-nutrition-examination-survey#medications.csv
Three data sets are selected and merged to form a new data set:
- demographic.csv
- diet.csv
- medications.csv
These data sets are imported using the base R function read.csv():
DEMOGRAPHICS_INTIAL <- read.csv("C:/Users/Niki/Desktop/DataPreprocessing/Assignment3/national-health-and-nutrition-examination-survey/demographic.csv",stringsAsFactors = FALSE)
head(DEMOGRAPHICS_INTIAL)
DIET_INITIAL <- read.csv("C:/Users/Niki/Desktop/DataPreprocessing/Assignment3/national-health-and-nutrition-examination-survey/diet.csv",stringsAsFactors = FALSE)
head(DIET_INITIAL)
MEDICATIONS_INITIAL <-read.csv("C:/Users/Niki/Desktop/DataPreprocessing/Assignment3/national-health-and-nutrition-examination-survey/medications.csv",stringsAsFactors = FALSE)
head(MEDICATIONS_INITIAL)
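The three imports above repeat the same long directory. A minimal sketch of one way to factor that out with base R's file.path() and read.csv() follows. The folder `data_dir`, the helper `read_survey_file()` and the toy file are hypothetical stand-ins introduced for illustration, not the author's files.

```r
# Hypothetical stand-in for the project data folder (assumption, not the author's path).
data_dir <- tempdir()

# Write a tiny CSV with made-up values so the sketch runs on its own.
toy <- data.frame(SEQN = c(1L, 2L), RXDDRUG = c("INSULIN", "GABAPENTIN"))
write.csv(toy, file.path(data_dir, "toy_medications.csv"), row.names = FALSE)

# Hypothetical helper: build the path in one place and keep strings as character.
read_survey_file <- function(file_name, dir = data_dir) {
  read.csv(file.path(dir, file_name), stringsAsFactors = FALSE)
}

toy_in <- read_survey_file("toy_medications.csv")
str(toy_in)
```

With a helper like this, only `data_dir` would need to change when the files move.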
Because the data sets are large, only the first 100 rows and the leading columns of each are kept. The attributes of the subsetted data are explained below:
Attributes of demographic.csv
- SEQN - Respondent sequence number
- SDDSRVYR - Data release cycle
- RIDSTATR - Interview/Examination status
- RIAGENDR - Gender
- RIDAGEYR - Age in years at screening
Attributes of diet.csv
- SEQN - Respondent sequence number
- WTDRD1 - Dietary day one sample weight
- WTDR2D - Dietary two-day sample weight
Attributes of medications.csv
- SEQN - Respondent sequence number
- RXDUSE - Prescription medicine taken in past month
- RXDDRUG - Generic drug name
- RXDDRGID - Generic drug code
- RXDRSD1 - Drug description
# Keep the first 100 rows and the leading columns of each table
DEMOGRAPHICS <- DEMOGRAPHICS_INTIAL[1:100, 1:5]
head(DEMOGRAPHICS)
DIET <- DIET_INITIAL[1:100, 1:3]
head(DIET)
MEDICATIONS <- MEDICATIONS_INITIAL[1:100, 1:5]
head(MEDICATIONS)
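Selecting columns by position depends on the column order of the CSV file. As a hedged alternative, the sketch below selects the same kind of columns by name on a made-up table; `demo_toy`, `keep_cols` and the `EXTRA` column are illustrative assumptions, not part of the survey files.

```r
# Toy stand-in for the imported demographic table (values are made up).
demo_toy <- data.frame(
  SEQN     = 1:6,
  SDDSRVYR = 8L,
  RIDSTATR = c(2L, 2L, 1L, 2L, 2L, 1L),
  RIAGENDR = c(1L, 2L, 1L, 2L, 1L, 2L),
  RIDAGEYR = c(69L, 54L, 72L, 9L, 73L, 56L),
  EXTRA    = letters[1:6]
)

# Name the wanted columns explicitly instead of relying on positions 1:5.
keep_cols <- c("SEQN", "SDDSRVYR", "RIDSTATR", "RIAGENDR", "RIDAGEYR")

# head() caps the number of rows and does not fail when the table is shorter.
demo_sub <- head(demo_toy[, keep_cols], 4)
demo_sub
```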
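The subsetted data sets are merged on the common attribute SEQN.
# Keep every demographic row and attach the dietary sample weights by SEQN
JOINED_DATA <- DEMOGRAPHICS %>% left_join(DIET, by = "SEQN")
head(JOINED_DATA)
# Inner join on SEQN (the merge() default, all = FALSE), which gives the 100 rows shown by str() below
FINALDATA <- merge(JOINED_DATA, MEDICATIONS, by = "SEQN")
head(FINALDATA)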
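The row count after a merge depends on the join type, and argument names in R are case-sensitive, so an upper-case BY or ALL is not matched to by or all. The following base R sketch on made-up tables illustrates both points; `left` and `right` are hypothetical toy data.

```r
# Toy tables with made-up values; SEQN 1 appears twice on the right-hand side.
left  <- data.frame(SEQN = c(1L, 2L, 3L), AGE = c(69L, 54L, 72L))
right <- data.frame(SEQN = c(1L, 1L, 3L, 4L), DRUG = c("A", "B", "C", "D"))

inner     <- merge(left, right, by = "SEQN")                # keys present in both tables
outer     <- merge(left, right, by = "SEQN", all = TRUE)    # all keys, NA where unmatched
left_only <- merge(left, right, by = "SEQN", all.x = TRUE)  # every row of `left`

# Upper-case names fall into `...` and are ignored, so this is the default inner join.
ignored <- merge(left, right, BY = "SEQN", ALL = TRUE)

c(inner = nrow(inner), outer = nrow(outer),
  left_only = nrow(left_only), ignored = nrow(ignored))
```

Comparing nrow() before and after a merge is a quick way to confirm which join was actually performed.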
Understand
The steps for understanding the data are as follows:
- The data types of the attributes in the merged data set are inspected using str().
- The interview status variable RIDSTATR and the gender variable RIAGENDR are converted to ordered factors with descriptive labels.
str(FINALDATA)
'data.frame': 100 obs. of 11 variables:
$ SEQN : int 73557 73557 73558 73558 73558 73558 73559 73559 73559 73559 ...
$ SDDSRVYR: int 8 8 8 8 8 8 8 8 8 8 ...
$ RIDSTATR: int 2 2 2 2 2 2 2 2 2 2 ...
$ RIAGENDR: int 1 1 1 1 1 1 1 1 1 1 ...
$ RIDAGEYR: int 69 69 54 54 54 54 72 72 72 72 ...
$ WTDRD1 : num 16888 16888 17932 17932 17932 ...
$ WTDR2D : num 12931 12931 12684 12684 12684 ...
$ RXDUSE : int 1 1 1 1 1 1 1 1 1 1 ...
$ RXDDRUG : chr "GABAPENTIN" "INSULIN" "GABAPENTIN" "INSULIN GLARGINE" ...
$ RXDDRGID: chr "d03182" "d00262" "d03182" "d04538" ...
$ RXDRSD1 : chr "Muscle spasm" "Type 2 diabetes mellitus" "Restless legs syndrome" "Type 2 diabetes mellitus" ...
# Recode the integer codes 1 and 2 as labelled, ordered factors
FINALDATA$RIDSTATR <- FINALDATA$RIDSTATR %>%
  factor(levels = c("1", "2"),
         labels = c("Interviewed", "Interviewed and Examined"), ordered = TRUE)
FINALDATA$RIAGENDR <- FINALDATA$RIAGENDR %>%
  factor(levels = c("1", "2"),
         labels = c("Male", "Female"), ordered = TRUE)
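The conversion above can be tried on a small vector. In this sketch the codes are made up, and leaving gender unordered is an alternative shown for comparison (an assumption of this example, since the report orders both factors).

```r
# Made-up integer codes in the style of RIDSTATR and RIAGENDR.
status_code <- c(2L, 1L, 2L, 2L)
gender_code <- c(1L, 2L, 2L, 1L)

# Ordered factor: comparisons such as < follow the order of `levels`.
status <- factor(status_code, levels = c(1, 2),
                 labels = c("Interviewed", "Interviewed and Examined"),
                 ordered = TRUE)

# Unordered (nominal) factor: labels only, no ranking between categories.
gender <- factor(gender_code, levels = c(1, 2), labels = c("Male", "Female"))

table(status)
is.ordered(status)
is.ordered(gender)
```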
Tidy & Manipulate Data I
- The final data set is inspected with str(), which shows the type and the first few values of each variable.
- The output shows that each variable has its own column and each row holds one respondent-medication record, so the data is already in tidy format and no additional steps are required.
str(FINALDATA)
'data.frame': 100 obs. of 11 variables:
$ SEQN : int 73557 73557 73558 73558 73558 73558 73559 73559 73559 73559 ...
$ SDDSRVYR: int 8 8 8 8 8 8 8 8 8 8 ...
$ RIDSTATR: Ord.factor w/ 2 levels "Interviewed"<..: 2 2 2 2 2 2 2 2 2 2 ...
$ RIAGENDR: Ord.factor w/ 2 levels "Male"<"Female": 1 1 1 1 1 1 1 1 1 1 ...
$ RIDAGEYR: int 69 69 54 54 54 54 72 72 72 72 ...
$ WTDRD1 : num 16888 16888 17932 17932 17932 ...
$ WTDR2D : num 12931 12931 12684 12684 12684 ...
$ RXDUSE : int 1 1 1 1 1 1 1 1 1 1 ...
$ RXDDRUG : chr "GABAPENTIN" "INSULIN" "GABAPENTIN" "INSULIN GLARGINE" ...
$ RXDDRGID: chr "d03182" "d00262" "d03182" "d04538" ...
$ RXDRSD1 : chr "Muscle spasm" "Type 2 diabetes mellitus" "Restless legs syndrome" "Type 2 diabetes mellitus" ...
Tidy & Manipulate Data II
- A new column, DIFF_IN_FOODINTAKE, is created using mutate(). It holds the difference between the dietary day one sample weight (WTDRD1) and the dietary two-day sample weight (WTDR2D).
FINALDATA = mutate(FINALDATA, DIFF_IN_FOODINTAKE=(WTDRD1-WTDR2D))
head(FINALDATA)
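A derived difference column inherits missing values from either input. The base R sketch below uses transform() as a stand-in for mutate() on made-up weights; `weights_toy` and the column name `DIFF` are illustrative assumptions.

```r
# Made-up sample weights; the last row has a missing day one weight.
weights_toy <- data.frame(
  WTDRD1 = c(16888, 17932, NA),
  WTDR2D = c(12931, 12684, 25000)
)

# transform() adds a column computed from existing ones, much like mutate().
weights_toy <- transform(weights_toy, DIFF = WTDRD1 - WTDR2D)

# The difference is NA wherever either input is NA.
weights_toy
```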
Scan I
- Missing values in the data frame are counted per column in the section below.
- Missing values are then handled. A missing sequence number (SEQN) would be replaced with "Unspecified" and a missing data release cycle (SDDSRVYR) with the column mean. Neither column contains NA values here, so the rows with missing sample weights are removed with na.omit().
colSums(is.na(FINALDATA))
SEQN SDDSRVYR RIDSTATR RIAGENDR RIDAGEYR WTDRD1
0 0 0 0 0 2
WTDR2D RXDUSE RXDDRUG RXDDRGID RXDRSD1 DIFF_IN_FOODINTAKE
5 0 0 0 0 5
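# SEQN and SDDSRVYR have no NA values (see the counts above), so these two steps replace nothing
FINALDATA$SEQN[is.na(FINALDATA$SEQN)] <- "Unspecified"
FINALDATA$SDDSRVYR <- impute(FINALDATA$SDDSRVYR, fun = mean)
# Drop the rows with a missing WTDRD1, WTDR2D or DIFF_IN_FOODINTAKE
FINALDATA <- na.omit(FINALDATA)
head(FINALDATA, 10)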
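Mean imputation can be sketched in base R on a small numeric vector. The helper `impute_mean()` is a hypothetical name introduced here; it is a base R stand-in for the idea behind impute(x, fun = mean), not the Hmisc implementation.

```r
# Hypothetical helper: replace NA entries with the mean of the observed values.
impute_mean <- function(x) {
  x[is.na(x)] <- mean(x, na.rm = TRUE)
  x
}

w <- c(10, NA, 30, 20, NA)   # made-up values with two gaps
w_filled <- impute_mean(w)

sum(is.na(w))          # number of gaps before imputation
sum(is.na(w_filled))   # number of gaps after imputation
w_filled
```

Unlike na.omit(), this keeps every row, at the cost of shrinking the variance of the imputed column.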
Scan II
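- Outliers are detected using boxplots and z-scores.
- Outliers are removed using Tukey's method (values above the upper fence, Q3 + 1.5 × IQR) and the z-score method (absolute z-score greater than 3).
- Boxplots are redrawn after treatment to check the result. The final Day 2 boxplot uses outline = FALSE, which hides any points beyond the whiskers.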
FINALDATA$WTDRD1 %>% boxplot(main="Boxplot of Dietary - Day 1 Sample weight", ylab="Sample Weight-Day1",col = "grey")

IQRSLOTS <- IQR(FINALDATA$WTDRD1, na.rm = TRUE)
q3SLOTS <- quantile(FINALDATA$WTDRD1, .75, na.rm = TRUE)
# Keep rows at or below the upper Tukey fence (Q3 + 1.5 * IQR)
FINALDATA <- filter(FINALDATA, WTDRD1 <= 1.5 * IQRSLOTS + q3SLOTS)
FINALDATA$WTDRD1 %>% boxplot(main="Boxplot of Dietary - Day 1 Sample weight (No Outliers)", ylab="Sample Weight-Day1", col = "grey")
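Tukey's rule is usually stated with a lower and an upper fence. A minimal base R sketch with both fences follows; the helper `tukey_keep()` and the sample vector are hypothetical and only illustrate the rule.

```r
# Hypothetical helper: TRUE for values inside [Q1 - k * IQR, Q3 + k * IQR].
tukey_keep <- function(x, k = 1.5) {
  q <- quantile(x, c(0.25, 0.75), na.rm = TRUE, names = FALSE)
  iqr <- q[2] - q[1]
  !is.na(x) & x >= q[1] - k * iqr & x <= q[2] + k * iqr
}

x <- c(1, 12, 13, 14, 15, 16, 17, 100)   # made-up values with one low and one high extreme
x[tukey_keep(x)]                         # values kept by the rule
x[!tukey_keep(x)]                        # values flagged as outliers
```

For a data frame, the logical vector can be used directly as a row index, for example df[tukey_keep(df$col), ].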

FINALDATA$WTDR2D %>% boxplot(main="Boxplot of Dietary - Day 2 Sample weight", ylab="Sample Weight-Day2",col = "grey")

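Z.SCORE <- FINALDATA$WTDR2D %>% scores(type = "z")
# Logical indexing keeps every row when no |z| exceeds 3
# (negative indexing with an empty which() result would return zero rows)
FINALDATA <- FINALDATA[abs(Z.SCORE) <= 3, ]
FINALDATA
IQRDIETOFDAY2 <- IQR(FINALDATA$WTDR2D, na.rm = TRUE)
Q3WTDR2D <- quantile(FINALDATA$WTDR2D, .75, na.rm = TRUE)
# Keep rows at or below the upper Tukey fence (Q3 + 1.5 * IQR)
FINALDATA <- filter(FINALDATA, WTDR2D <= 1.5 * IQRDIETOFDAY2 + Q3WTDR2D)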
FINALDATA$WTDR2D %>% boxplot(main="Boxplot of Dietary - Day 2 Sample weight (No Outliers)", ylab="Sample Weight-Day2", col = "grey",outline = FALSE)
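The z-score filter can be checked on small vectors. In this sketch `z_scores()` is a hypothetical base R helper that computes (x - mean) / sd, which is assumed to correspond to the z-scores used above; the data are made up.

```r
# Hypothetical helper: standardise a numeric vector.
z_scores <- function(x) (x - mean(x, na.rm = TRUE)) / sd(x, na.rm = TRUE)

# Case 1: no value is extreme, so which() finds nothing.
clean <- c(10, 11, 12, 13, 14)
z <- z_scores(clean)
dropped_all <- clean[-which(abs(z) > 3)]   # negative indexing with an empty index returns nothing
kept_all    <- clean[abs(z) <= 3]          # logical indexing keeps all five values

# Case 2: one extreme value among twenty identical ones.
with_outlier <- c(rep(10, 20), 1000)
kept <- with_outlier[abs(z_scores(with_outlier)) <= 3]

length(dropped_all)
length(kept_all)
length(kept)
```

The first case shows why a logical condition is the safer way to filter: it behaves the same whether or not any outlier is present.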
