Motivation and Goals

In this notebook we analyze U.S. universities and colleges. The data file is named ‘College.csv’ and can be obtained from https://www.statlearning.com/resources-first-edition.

Our goal for this data set is to perform Exploratory Data Analysis (EDA). At this point we do not seek to build any predictive models; we are simply looking to gain some insights from the data set.

Normally, the data science process requires cleaning the data and imputing missing values. The providers of this data set have ensured its completeness, so it contains no missing values. Regardless of the data provider’s kindness, we will follow data science etiquette and treat the data as if it could contain missing values and require cleaning.

Exploratory Data Analysis

To aid our analysis we begin by loading the data set into our RStudio environment.

college <- read.csv("College.csv", header = TRUE, na.strings = c("", "NA"))
head(college)

Next, let’s check the dimensions of the data set.

dim(college)
[1] 777  19

Now that we have loaded the data set, we can check for missing values. As stated earlier, the data set is clean and complete; we check for good measure and practice.

sum(is.na(college))
[1] 0

The sum is 0, indicating that no values in our data set are missing or marked as NA.
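A single total can hide where missing values sit. As a minimal sketch, assuming a small made-up data frame `toy` (it is not part of College.csv), per-column counts can be obtained with `colSums(is.na(...))`:

```r
# Hypothetical toy data frame with two missing values, for illustration only
toy <- data.frame(
  Apps   = c(100, NA, 300),
  Accept = c(80, 150, NA)
)

# Number of missing values in each column
na_per_column <- colSums(is.na(toy))
print(na_per_column)

# Number of rows with no NA in any column
complete_rows <- sum(complete.cases(toy))
print(complete_rows)
```

On the real data, `colSums(is.na(college))` would be expected to show zero for every column, in line with the total above.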

Also note that the first column, X, holds the school names. If we leave it in place, R treats it as a variable in calculations, whereas we want it to serve as the row names. We can fix this by assigning column 1 as the row names and then deleting column 1, since the row names now carry that information.

rownames(college) <- college[,1]
college <- college[, -1]
head(college)

As we can see, the school names have moved from column X to the row names. They are no longer treated as a feature, which is why the X header is replaced by a blank.
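The same result can usually be reached in one step at load time. A minimal sketch, assuming a made-up two-row CSV held in the string `csv_text` as a stand-in for College.csv, uses the `row.names` argument of `read.csv`:

```r
# Hypothetical CSV with an unnamed first column, standing in for College.csv
csv_text <- ',Private,Apps
"School A",Yes,1000
"School B",No,2500'

# row.names = 1 tells read.csv to use the first column as row names
toy <- read.csv(text = csv_text, row.names = 1)

print(rownames(toy))
print(names(toy))
```

For the real file this would correspond to `read.csv("College.csv", row.names = 1)`, which makes the separate assignment and deletion steps unnecessary.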

Let’s take a look at the summary of this data set.

summary(college)
   Private               Apps           Accept          Enroll    
 Length:777         Min.   :   81   Min.   :   72   Min.   :  35  
 Class :character   1st Qu.:  776   1st Qu.:  604   1st Qu.: 242  
 Mode  :character   Median : 1558   Median : 1110   Median : 434  
                    Mean   : 3002   Mean   : 2019   Mean   : 780  
                    3rd Qu.: 3624   3rd Qu.: 2424   3rd Qu.: 902  
                    Max.   :48094   Max.   :26330   Max.   :6392  
   Top10perc       Top25perc      F.Undergrad     P.Undergrad     
 Min.   : 1.00   Min.   :  9.0   Min.   :  139   Min.   :    1.0  
 1st Qu.:15.00   1st Qu.: 41.0   1st Qu.:  992   1st Qu.:   95.0  
 Median :23.00   Median : 54.0   Median : 1707   Median :  353.0  
 Mean   :27.56   Mean   : 55.8   Mean   : 3700   Mean   :  855.3  
 3rd Qu.:35.00   3rd Qu.: 69.0   3rd Qu.: 4005   3rd Qu.:  967.0  
 Max.   :96.00   Max.   :100.0   Max.   :31643   Max.   :21836.0  
    Outstate       Room.Board       Books           Personal   
 Min.   : 2340   Min.   :1780   Min.   :  96.0   Min.   : 250  
 1st Qu.: 7320   1st Qu.:3597   1st Qu.: 470.0   1st Qu.: 850  
 Median : 9990   Median :4200   Median : 500.0   Median :1200  
 Mean   :10441   Mean   :4358   Mean   : 549.4   Mean   :1341  
 3rd Qu.:12925   3rd Qu.:5050   3rd Qu.: 600.0   3rd Qu.:1700  
 Max.   :21700   Max.   :8124   Max.   :2340.0   Max.   :6800  
      PhD            Terminal       S.F.Ratio      perc.alumni   
 Min.   :  8.00   Min.   : 24.0   Min.   : 2.50   Min.   : 0.00  
 1st Qu.: 62.00   1st Qu.: 71.0   1st Qu.:11.50   1st Qu.:13.00  
 Median : 75.00   Median : 82.0   Median :13.60   Median :21.00  
 Mean   : 72.66   Mean   : 79.7   Mean   :14.09   Mean   :22.74  
 3rd Qu.: 85.00   3rd Qu.: 92.0   3rd Qu.:16.50   3rd Qu.:31.00  
 Max.   :103.00   Max.   :100.0   Max.   :39.80   Max.   :64.00  
     Expend        Grad.Rate     
 Min.   : 3186   Min.   : 10.00  
 1st Qu.: 6751   1st Qu.: 53.00  
 Median : 8377   Median : 65.00  
 Mean   : 9660   Mean   : 65.46  
 3rd Qu.:10830   3rd Qu.: 78.00  
 Max.   :56233   Max.   :118.00  

Note that Private is a qualitative variable, and we need to tell R to treat it as such.

private <- as.factor(college$Private)

Since our goal is to conduct EDA, and we did not start with a more specific question for the data, let’s take a look at a select few features.

NOTE: This is only for demonstration. When a data set has a large number of features, we should select a few key ones. In the real world, having a motivation or end goal in mind for the data set is key to analyzing features.

Feature Analysis

Private

plot(private, col= c("#7F8C8D", "#2C3E50"), 
     ylim=c(0, 600), xlab="Private Status", 
     ylab="Count", main="Private vs Non-Private")

We notice that the vast majority of schools in our data set are private. We can find out exactly how many schools are private vs non-private.

summary(private)
 No Yes 
212 565 

In our data set we have 565 private schools and the remaining 212 are non-private.

private_table <- table(private)
prop.table(private_table)
private
       No       Yes 
0.2728443 0.7271557 

We notice that 72.72% of schools in our data set are private and the remaining 27.28% are non-private. According to U.S. News, the U.S. Department of Education lists nearly 4,000 degree-granting academic institutions. In our data set we have 777 of these institutions.
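The percentages follow directly from the counts. A minimal sketch, using only the two counts reported by summary(private) rather than the full data set:

```r
# Counts of non-private and private schools, as reported by summary(private)
counts <- c(No = 212, Yes = 565)

# Convert counts to percentages of the total, rounded to two decimals
percentages <- round(100 * counts / sum(counts), 2)
print(percentages)
```

Working from the factor itself, `round(100 * prop.table(table(private)), 2)` should give the same figures.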

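Within this sample, the vast majority of institutions are private, which suggests the same may hold for educational institutions in the U.S. overall. Caveat: the sample covers only about a fifth of the institutions counted above, and we don’t have a lot of information on how the data was collected or what biases it may contain.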

Apps

To analyze applications, we can use a histogram, which displays the distribution of a numeric variable.

hist(college$Apps, col="#2980B9", breaks=30,
     ylim=c(0, 500), xlim=c(0, 25000), 
     xlab="Applications", ylab="Frequency", 
     main="Number of College Applications")

The distribution above does not resemble a normal distribution. It looks more like an exponential distribution. Let’s overlay both an exponential and a normal curve on top of the histogram of Applications. Before doing that, let’s find the mean and standard deviation of Applications.

mew <- mean(college$Apps)
std <- sd(college$Apps)
mew;std
[1] 3001.638
[1] 3870.201

This tells us that on average a college receives about 3,000 applications per application term.

hist(college$Apps, col="#B2AAAF", 
     breaks=30, freq=F, xlim=c(0, 25000),
     xlab="Applications", ylab="Density", 
     main="Number of College Applications")
curve(dnorm(x,mew,std),col="#5AC18E",lwd=2,add=T)
curve(dexp(x,rate=1/mew),col="black",lwd=2,add=T)

We notice that college applications follow an exponential distribution more closely than a normal distribution: the exponential curve fits the data better.

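The exponential distribution is usually used to model the amount of time until some specific event occurs. Application counts are not waiting times, so the fit is best read as a description of the shape of the data (strongly right-skewed, with many small schools and a few very large ones) rather than as evidence of an underlying exponential process.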
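The visual comparison can be backed by two rough numeric checks. The sketch below is an illustration under stated assumptions, not part of the original analysis: it first computes the ratio of standard deviation to mean from the values printed above (for an exponential distribution this ratio is exactly 1), and then compares the log-likelihood of an exponential and a normal fit on simulated stand-in data `x`, which is not the real Apps column.

```r
# Mean and standard deviation of Apps, as printed above
apps_mean <- 3001.638
apps_sd   <- 3870.201

# Coefficient of variation; exactly 1 for an exponential distribution
cv <- apps_sd / apps_mean
print(cv)

# Simulated stand-in data (NOT the real Apps column), drawn from an exponential
set.seed(42)
x <- rexp(777, rate = 1 / apps_mean)

# Log-likelihood under each fitted model; the larger value indicates the better fit
loglik_exp  <- sum(dexp(x, rate = 1 / mean(x), log = TRUE))
loglik_norm <- sum(dnorm(x, mean = mean(x), sd = sd(x), log = TRUE))
print(c(exponential = loglik_exp, normal = loglik_norm))
```

Replacing `x` with `college$Apps` would apply the same comparison to the real data. A ratio above 1 hints that the real data has a heavier right tail than a pure exponential would.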

Let’s see how applications received by private vs non-private colleges vary. For this we can use a boxplot.

plot(private, college$Apps, col="#A569BD", varwidth=F, horizontal=T,
     ylab="Private Status", xlab="Number of Applications", main="Private vs Non-Private College Applications Boxplot")

We notice that non-private schools tend to receive more applications than private ones. The maximum number of applications received by a non-private school is close to 50,000, whereas the maximum for private schools is only about 20,000. Private schools also show a lot of outliers.
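What the boxplot shows visually can also be read off as numbers per group. A minimal sketch, assuming a made-up data frame `toy` in place of `college` (the values are hypothetical), uses `tapply` to summarize a numeric column by a factor:

```r
# Hypothetical toy data standing in for college$Private and college$Apps
toy <- data.frame(
  Private = factor(c("Yes", "Yes", "Yes", "No", "No", "No")),
  Apps    = c(500, 900, 1300, 2000, 4000, 9000)
)

# Median and maximum number of applications per group
group_median <- tapply(toy$Apps, toy$Private, median)
group_max    <- tapply(toy$Apps, toy$Private, max)

print(group_median)
print(group_max)
```

On the real data the equivalent calls would be `tapply(college$Apps, private, median)` and `tapply(college$Apps, private, max)`.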

Accept

Let’s take a look at how many applicants were actually accepted by the universities. We can also plot a histogram for this feature.

hist(college$Accept, col="#DAA06D", breaks=30,
     xlim=c(0, 20000), ylim=c(0, 400),
     xlab="Acceptances", ylab="Frequency", main="College Acceptances")

Let’s check what the mean and standard deviation look like for college acceptances.

mean(college$Accept)
[1] 2018.804
sd(college$Accept)
[1] 2451.114

The histogram appears to follow an exponential distribution as well. We can overlay the two histograms to compare them.

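# Shared bin edges so that bar heights are comparable across the two variables
bins <- seq(0, 50000, by=1000)

hist(college$Apps, col="#2980B9", breaks=bins,
     ylim=c(0, 500), xlim=c(0, 25000), 
     xlab="Count per college", ylab="Frequency", 
     main="College Applications and Acceptances")

# With add=TRUE the axis limits, labels and title of the first plot are reused
hist(college$Accept, col="#DAA06D", breaks=bins, add=TRUE)
legend("topright", legend=c("Applications", "Acceptances"),
       fill=c("#2980B9", "#DAA06D"))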

This is natural to see. The histograms show that acceptances are shifted toward smaller values than applications: colleges receive many applications and accept only a portion of them. However, an interesting question arises: how do acceptances differ between private and non-private schools? We can guess that because private schools receive fewer applications, they also accept fewer people into their programs.

plot(private, college$Accept, col="#967969", varwidth=F, horizontal=T,
     ylab="Private Status", xlab="Number of Acceptances", 
     main="Private vs Non-Private College Acceptances Boxplot")

The boxplots of applications and acceptances look quite similar. This is to be expected because, as we saw above, the distributions are quite similar too. We can view them side by side.

par(mfrow=c(1,2))

# Vertical boxplots: the groups are on the x-axis and the counts on the y-axis
plot(private, college$Apps, col="#A569BD", varwidth=TRUE, horizontal=FALSE,
     xlab="Private Status", ylab="Number of Applications", 
     main="Private vs Non-Private College Applications Boxplot", 
     cex.main=0.6, cex.lab=0.6, cex.axis=0.6)

plot(private, college$Accept, col="#967969", varwidth=TRUE, horizontal=FALSE,
     xlab="Private Status", ylab="Number of Acceptances", 
     main="Private vs Non-Private College Acceptances Boxplot",
     cex.main=0.6, cex.lab=0.6, cex.axis=0.6)

par(mfrow=c(1,1))  # restore the single-panel layout

Enroll

For this feature we can hypothesize that students send a lot of applications, only some of those applications are accepted, and even fewer students actually enroll in a program. The distribution should again be similar to the histograms we saw above.

hist(college$Enroll, col="#DAA08A", breaks=30,
     xlim=c(0, 7000), ylim=c(0, 250),
     xlab="Enrolments", ylab="Frequency", 
     main="College Enrolments")
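The hypothesized funnel from applications to acceptances to enrollments can be checked roughly with the column means. A minimal sketch, using only the rounded means from the summary(college) output above, so the shares are approximate:

```r
# Mean values per school, taken from the summary(college) output above (rounded)
funnel <- c(Apps = 3002, Accept = 2019, Enroll = 780)

# Share of applications that are accepted, and share of acceptances that enroll
accept_share <- funnel[["Accept"]] / funnel[["Apps"]]
enroll_share <- funnel[["Enroll"]] / funnel[["Accept"]]

print(round(c(accept_share = accept_share, enroll_share = enroll_share), 3))
```

Because every column is averaged over the same 777 schools, a ratio of means equals the ratio of the corresponding column sums.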

Here, we can ask the following questions:

  1. What is the probability of a student applying to a private school?
total_applications <- sum(college$Apps)
private_applications <- sum(college[college$Private == "Yes", ]$Apps)
prob_private_app <- private_applications/total_applications
prob_private_app
[1] 0.4791592
  2. What is the probability of being accepted given application to a private school?

Probability of getting accepted.

prob_acceptance <- sum(college$Accept)/total_applications
prob_acceptance
[1] 0.6725675

Probability of getting accepted by a private school.

prob_private_acceptance <- sum(college[college$Private == "Yes", ]$Accept)/private_applications
prob_private_acceptance
[1] 0.6601362

Probability of applying to a private school and getting accepted.

prob_private_app_and_private_acceptance <- prob_private_app * prob_private_acceptance
prob_private_app_and_private_acceptance
[1] 0.3163103

Probability that an application went to a private school, given that it was accepted.

prob_private_app_given_private_acceptance <- prob_private_app_and_private_acceptance/prob_acceptance
prob_private_app_given_private_acceptance
[1] 0.4703027
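The chain of probabilities can be reproduced from the definition of conditional probability. The sketch below is a sanity check that uses only the rounded values printed above; the short variable names are introduced here for illustration. The last step applies Bayes’ rule in the opposite direction, which should return the conditional acceptance rate we started from.

```r
# Rounded values printed above
p_private              <- 0.4791592  # share of applications sent to private schools
p_accept               <- 0.6725675  # overall share of applications accepted
p_accept_given_private <- 0.6601362  # share of private-school applications accepted

# Joint probability: applied to a private school AND was accepted
p_joint <- p_private * p_accept_given_private

# Conditional probability: application went to a private school, given acceptance
p_private_given_accept <- p_joint / p_accept

# Bayes' rule in the other direction recovers the value we started from
p_roundtrip <- p_private_given_accept * p_accept / p_private

print(c(joint = p_joint,
        private_given_accept = p_private_given_accept,
        roundtrip = p_roundtrip))
```

Note that these are shares of applications, not of students: one student may apply to several schools, so reading them as a single student’s chances is an approximation.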

Probability of acceptance given application to private schools, recovered with Bayes’ rule.

prob_private_acceptance_given_private_app <- (prob_private_app_given_private_acceptance * prob_acceptance)/prob_private_app
prob_private_acceptance_given_private_app
[1] 0.6601362

This is surprisingly good. Given that I have applied to a private school, I have roughly a 0.66 chance of being accepted, which matches the acceptance rate of private schools computed directly above. But this is somewhat skewed because there are a lot of private schools in the U.S. Let’s find out what the probability is for elite schools.

Elite

We do not have an elite feature, but one can be engineered. Assuming that most elite schools draw their students from the top 10 percent of their prior class, we can use the Top10perc feature: a school is labeled elite when more than 50 percent of its students come from that top 10 percent.

Elite <- rep("No", nrow(college))
Elite[college$Top10perc > 50] <- "Yes"
Elite <- as.factor(Elite)
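The same feature can be built in a single vectorized step. A minimal sketch, assuming a made-up vector `top10perc` in place of `college$Top10perc`, uses `ifelse` and fixes the factor level order explicitly:

```r
# Hypothetical Top10perc values, for illustration only
top10perc <- c(23, 51, 50, 96)

# "Yes" when more than 50 percent of students come from the top 10 percent
elite <- factor(ifelse(top10perc > 50, "Yes", "No"), levels = c("No", "Yes"))

print(table(elite))
```

Note that a value of exactly 50 is labeled "No", because the comparison is strict.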

Let’s check how many elite schools there are in our data set.

plot(Elite, col=c("#1e81b0", "#38b01e"), 
     ylim=c(0, 700),
     xlab="Elite School", ylab="Count", main="Elite Schools in the U.S.")

Let’s find out exactly how many elite schools there are in our data set.

summary(Elite)
 No Yes 
699  78 

Let’s find out which elite schools receive the most applications.

college$Elite <- Elite
elite_schools <- college[college$Elite == "Yes", ]
# Sort by Apps in descending order and keep only that column (referenced by name)
elite_schools[order(-elite_schools$Apps), "Apps", drop=FALSE]

We can see that UC Berkeley receives the most applications of all the elite schools. Let’s check the acceptance rate of elite schools.

acceptance_rate <- elite_schools$Accept/elite_schools$Apps
elite_schools$Accept.Rate <- acceptance_rate
# Sort by acceptance rate in ascending order and keep only that column
elite_schools[order(elite_schools$Accept.Rate), "Accept.Rate", drop=FALSE]

Princeton has the lowest acceptance rate of all the elite schools in our data set. Only about 15.45% of applicants are accepted.
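When only the single lowest rate is of interest, a full sort is not needed. A minimal sketch, assuming a made-up data frame `toy` with hypothetical schools and counts, picks the row with `which.min` and formats the rate as a percentage:

```r
# Hypothetical schools and counts, for illustration only (not taken from College.csv)
toy <- data.frame(
  Apps   = c(12000, 8000, 15000),
  Accept = c(3000, 4000, 3000),
  row.names = c("School A", "School B", "School C")
)

# Acceptance rate per school
toy$Accept.Rate <- toy$Accept / toy$Apps

# Row with the smallest acceptance rate
lowest <- toy[which.min(toy$Accept.Rate), ]
lowest_name <- rownames(lowest)
print(lowest_name)

# Rate formatted as a percentage with two decimals
lowest_pct <- sprintf("%.2f%%", 100 * lowest$Accept.Rate)
print(lowest_pct)
```

Applying the same steps to `college` instead of `elite_schools` would show whether the lowest rate among elite schools is also the lowest across all 777 schools.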
