Main topic: clustering articles by their word-frequency table

Key learning points:

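# Start from a clean workspace
rm(list=ls(all=TRUE))
# Use the C locale for all locale categories
Sys.setlocale("LC_ALL","C")
[1] "C"
# Print 4 significant digits and avoid scientific notation
options(digits=4, scipen=12)
library(dplyr)  # provides the %>% pipe used below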



1. Hierarchical Clustering

1.1 Word-Frequency Table, Distance Matrix, and Hierarchical Cluster Analysis

Let’s start by building a hierarchical clustering model. First, read the data set into R. Then compute the distances (using method="euclidean"), and use hclust to build the model (using method="ward.D"). You should cluster on all of the variables.

dailykos = read.csv("data/dailykos.csv")
distance = dist(dailykos,method="euclidean")
clusterdailykos = hclust(distance,method="ward.D")

Running the dist function will probably take you a while. Why? Select all that apply.

  • We have a lot of observations, so it takes a long time to compute the distance between each pair of observations.
  • We have a lot of variables, so the distance computation is long.
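
To make the cost concrete, here is a rough back-of-the-envelope sketch. The sizes are read off this notebook (3430 is the sum of the cluster sizes reported later, 1545 is the column range indexed later as 1:1545); the toy matrix is made up purely for illustration.

```r
# Sizes taken from this notebook: 3430 rows (sum of the reported cluster
# sizes) and 1545 word columns (the range 1:1545 used later on).
n_obs  <- 3430
n_vars <- 1545

# dist() evaluates one distance for every unordered pair of rows ...
n_pairs <- n_obs * (n_obs - 1) / 2
# ... and each Euclidean distance sums one squared difference per column.
n_terms <- n_pairs * n_vars

# Toy check (made-up data): a 5-row matrix yields choose(5, 2) = 10 distances.
toy <- matrix(c(1, 0, 2, 0,
                0, 3, 0, 1,
                2, 2, 0, 0,
                0, 0, 1, 4,
                1, 1, 1, 1), nrow = 5, byrow = TRUE)
toy_dist <- dist(toy, method = "euclidean")
length(toy_dist)
```

The pair count grows quadratically with the number of observations, and the work per pair grows linearly with the number of variables, which is why both answers apply.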

Plot the dendrogram of your hierarchical clustering model.

plot(clusterdailykos)
rect.hclust(clusterdailykos, k=7, border="red")

1.2 Choosing the Number of Clusters from the Dendrogram

Just looking at the dendrogram, which of the following seem like good choices for the number of clusters? Select all that apply.

  • 2
  • 3
  • Cutting where the vertical distance between merges is large is the more suitable choice.
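
The "large vertical distance" rule can also be checked numerically instead of by eye. The following is a minimal sketch on made-up data (three tight synthetic groups); the names toy, gaps, and candidate_k are illustrative and not part of the assignment.

```r
# Made-up data: three tight, well-separated groups of 20 points each.
set.seed(42)
toy <- rbind(matrix(rnorm(40, mean = 0,  sd = 0.2), ncol = 2),
             matrix(rnorm(40, mean = 10, sd = 0.2), ncol = 2),
             matrix(rnorm(40, mean = 20, sd = 0.2), ncol = 2))
hc <- hclust(dist(toy, method = "euclidean"), method = "ward.D")

# Merge heights, largest first: heights[1] is the final merge (2 -> 1 clusters).
heights <- rev(hc$height)
# gaps[j] is the vertical distance crossed when cutting into j + 1 clusters.
gaps <- heights[-length(heights)] - heights[-1]
# Candidate cluster counts, ordered by how tall the gap is.
candidate_k <- order(gaps, decreasing = TRUE) + 1
head(candidate_k, 2)

# Cutting at k = 3 should recover the three synthetic groups.
table(cutree(hc, k = 3))
```

On real data the gaps are rarely this clear-cut, so the ranking is best treated as a shortlist to compare against the plotted dendrogram.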

1.3 Choosing the Number of Clusters from the Application

In this problem, we are trying to cluster news articles or blog posts into groups. This can be used to show readers categories to choose from when deciding what to read. Just thinking about this application, what are good choices for the number of clusters? Select all that apply.

  • 7
  • 8

1.4 Splitting the Data by Cluster

Let’s pick 7 clusters. This number is reasonable according to the dendrogram, and also seems reasonable for the application. Use the cutree function to split your data into 7 clusters.

hierGroups = cutree(clusterdailykos, k = 7)
table(hierGroups) %>% sort(T)
hierGroups
   1    6    5    3    2    7    4 
1266  714  407  374  321  209  139 

Now, we don’t really want to run tapply on every single variable when we have over 1,000 different variables. Let’s instead use the subset function to subset our data by cluster. Create 7 new datasets, each containing the observations from one of the clusters.

How many observations are in cluster 3?

table(hierGroups)
hierGroups
   1    2    3    4    5    6    7 
1266  321  374  139  407  714  209 
  • 374

Which cluster has the most observations?

  • 1

Which cluster has the fewest observations?

  • 4

1.5 Most Frequent Words in Cluster 1

Instead of looking at the average value in each variable individually, we’ll just look at the top 6 words in each cluster. To do this for cluster 1, type the following in your R console (where “HierCluster1” should be replaced with the name of your first cluster subset):

tail(sort(colMeans(HierCluster1)))

This computes the mean frequency values of each of the words in cluster 1, and then outputs the 6 words that occur the most frequently. The colMeans function computes the column (word) means, the sort function orders the words in increasing order of the mean values, and the tail function outputs the last 6 words listed, which are the ones with the largest column means.
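
The three steps can be traced one at a time on a tiny made-up table. The word counts below are invented for illustration and have nothing to do with the dailykos data.

```r
# Made-up word counts for two articles (rows) and seven words (columns).
toy <- data.frame(bush = c(2, 4), kerry = c(1, 1), poll = c(0, 1),
                  iraq = c(3, 0), vote  = c(0, 0), dean = c(4, 0),
                  war  = c(5, 5))

word_means   <- colMeans(toy)       # mean count per word
sorted_means <- sort(word_means)    # increasing order of the means
top6         <- tail(sorted_means)  # last six entries = six largest means
top6
names(top6)[length(top6)]           # the single most frequent word
```

Because sort is increasing, the most frequent word is the last name in the output, not the first.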

What is the most frequent word in this cluster, in terms of average value? Enter the word exactly how you see it in the output:

HierCluster1 = subset(dailykos, hierGroups == 1)
HierCluster2 = subset(dailykos, hierGroups == 2)
HierCluster3 = subset(dailykos, hierGroups == 3)
HierCluster4 = subset(dailykos, hierGroups == 4)
HierCluster5 = subset(dailykos, hierGroups == 5)
HierCluster6 = subset(dailykos, hierGroups == 6)
HierCluster7 = subset(dailykos, hierGroups == 7)
tail(sort(colMeans(HierCluster1),F))
     state republican       poll   democrat      kerry       bush 
    0.7575     0.7591     0.9036     0.9194     1.0624     1.7054 
  • bush

1.6 Most Frequent Words in Each Cluster

Now repeat the command given in the previous problem for each of the other clusters, and answer the following questions.

tail(sort(colMeans(HierCluster2),F))
     bush  democrat challenge      vote      poll  november 
    2.847     2.850     4.097     4.399     4.847    10.340 
tapply(dailykos$iraq , hierGroups, mean) %>% sort(T)
     5      3      4      1      2      7      6 
2.4275 0.8182 0.6835 0.5166 0.4237 0.1388 0.1303 

Which words best describe cluster 2?

  • november, poll, vote, challenge

Which cluster could best be described as the cluster related to the Iraq war?

  • 5

In 2004, Howard Dean was one of the candidates for the Democratic nomination for President of the United States, John Kerry was the candidate who won the Democratic nomination, and John Edwards was Kerry's running mate (the vice-presidential nominee). Given this information,

tail(sort(colMeans(HierCluster2)))
     bush  democrat challenge      vote      poll  november 
    2.847     2.850     4.097     4.399     4.847    10.340 
tail(sort(colMeans(HierCluster3)))
     elect    parties      state republican   democrat       bush 
     1.647      1.666      2.321      2.524      3.824      4.406 
tail(sort(colMeans(HierCluster4)))
campaign    voter presided     poll     bush    kerry 
   1.432    1.540    1.626    3.590    7.835    8.439 
tail(sort(colMeans(HierCluster5)))
      american       presided administration            war           iraq           bush 
         1.091          1.120          1.231          1.776          2.428          3.941 
tail(sort(colMeans(HierCluster6)))
    race     bush    kerry    elect democrat     poll 
  0.4580   0.4888   0.5168   0.5350   0.5644   0.5812 
tail(sort(colMeans(HierCluster7)))
democrat    clark   edward     poll    kerry     dean 
   2.148    2.498    2.608    2.766    3.952    5.804 

which cluster best corresponds to the Democratic Party?

  • 7
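
Repeating the command once per cluster works, but the same summary can be produced in one pass. A minimal sketch on made-up data follows; toy, groups, and top_n are illustrative names, and groups stands in for the vector returned by cutree.

```r
# Made-up word counts: four articles (rows), three words (columns).
toy <- data.frame(bush = c(5, 4, 0, 1),
                  iraq = c(0, 1, 6, 7),
                  poll = c(2, 3, 0, 0))
groups <- c(1, 1, 2, 2)   # stand-in for the vector returned by cutree()
top_n  <- 2               # number of words to report per cluster

# Split the rows by cluster, then take the top words within each piece.
top_words <- lapply(split(toy, groups), function(x) {
  names(tail(sort(colMeans(x)), top_n))
})
top_words

# Mean of a single word per cluster, the same pattern as the tapply call
# used for `iraq` in this section.
iraq_means <- tapply(toy$iraq, groups, mean)
iraq_means
```

The list form keeps one entry per cluster, which makes it easy to scan for a cluster dominated by candidate names or by war-related words.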



2. K-Means Clustering

2.1 K-Means Cluster Analysis

Now, run k-means clustering, setting the seed to 1000 right before you run the kmeans function. Again, pick the number of clusters equal to 7. You don’t need to add the iter.max argument.

set.seed(1000)
KMC = kmeans(dailykos, centers = 7)
table(KMC$cluster) %>% sort(T)

   4    6    7    3    5    1    2 
2063  329  308  277  163  146  144 
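
The reason for setting the seed right before the call is that kmeans picks its starting centers at random. The sketch below illustrates this on made-up data; the iter.max and nstart values are arbitrary choices for illustration, not settings used in this assignment.

```r
# Made-up data: two well-separated groups of 30 points each.
set.seed(1)
toy <- rbind(matrix(rnorm(60, mean = 0, sd = 0.3), ncol = 2),
             matrix(rnorm(60, mean = 5, sd = 0.3), ncol = 2))

# Same seed immediately before each call -> same random starting centers,
# hence the same cluster labels.
set.seed(1000)
km_a <- kmeans(toy, centers = 2)
set.seed(1000)
km_b <- kmeans(toy, centers = 2)
identical(km_a$cluster, km_b$cluster)

# iter.max caps the number of iterations; nstart tries several random
# starts and keeps the best one.
set.seed(1000)
km_c <- kmeans(toy, centers = 2, iter.max = 100, nstart = 5)
table(km_c$cluster)
```

Note that the cluster numbers themselves are arbitrary labels: a different seed can return the same grouping with the numbers permuted.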

Subset your data into the 7 clusters (7 new datasets) by using the “cluster” variable of your kmeans output.

How many observations are in Cluster 3?

  • 277

Which cluster has the most observations?

  • 4

Which cluster has the fewest number of observations?

  • 2

2.2 Most Frequent Words in Each Cluster

Now, output the six most frequent words in each cluster, like we did in the previous problem, for each of the k-means clusters.

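# Attach the k-means labels as an extra column of dailykos
dailykos$cluster = KMC$cluster
# Split on the labels but keep only the word columns (1:1545), so the
# label column itself is not ranked as a word
sapply(split(dailykos[,1:1545], dailykos$cluster),
       function(x) names(head(sort(colMeans(x), decreasing = TRUE))))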
     1                2           3                4            5            6           7         
[1,] "bush"           "dean"      "iraq"           "bush"       "democrat"   "november"  "kerry"   
[2,] "presided"       "kerry"     "war"            "democrat"   "republican" "poll"      "bush"    
[3,] "administration" "clark"     "bush"           "poll"       "parties"    "vote"      "poll"    
[4,] "kerry"          "edward"    "american"       "kerry"      "state"      "challenge" "campaign"
[5,] "iraq"           "democrat"  "iraqi"          "republican" "senate"     "bush"      "voter"   
[6,] "state"          "primaries" "administration" "elect"      "race"       "democrat"  "presided"

Which k-means cluster best corresponds to the Iraq War?

  • 3

Which k-means cluster best corresponds to the Democratic Party? (Remember that we are looking for the names of the key Democratic Party leaders.)

  • 2

2.3 ~ 2.6 Correspondence Between the Two Clustering Results

For the rest of this problem, we’ll ask you to compare how observations were assigned to clusters in the two different methods. Use the table function to compare the cluster assignment of hierarchical clustering to the cluster assignment of k-means clustering.

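# Note: dailykos still carries the `cluster` label column added in 2.2, so
# `cluster` shows up among the column means printed below.
KmeansCluster1 = subset(dailykos, KMC$cluster == 1)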
KmeansCluster2 = subset(dailykos, KMC$cluster == 2)
KmeansCluster3 = subset(dailykos, KMC$cluster == 3)
KmeansCluster4 = subset(dailykos, KMC$cluster == 4)
KmeansCluster5 = subset(dailykos, KMC$cluster == 5)
KmeansCluster6 = subset(dailykos, KMC$cluster == 6)
KmeansCluster7 = subset(dailykos, KMC$cluster == 7)
tail(sort(colMeans(KmeansCluster1)))
         state           iraq          kerry administration       presided           bush 
         1.610          1.616          1.637          2.664          2.767         11.432 
tail(sort(colMeans(KmeansCluster2)))
primaries  democrat    edward     clark     kerry      dean 
    2.319     2.694     2.799     3.090     4.979     8.278 
tail(sort(colMeans(KmeansCluster3)))
   iraqi american     bush  cluster      war     iraq 
   1.610    1.686    2.610    3.000    3.025    4.094 
tail(sort(colMeans(KmeansCluster4)))
republican      kerry       poll   democrat       bush    cluster 
    0.6175     0.6495     0.7475     0.7891     1.1474     4.0000 
tail(sort(colMeans(KmeansCluster5)))
    senate      state    parties republican    cluster   democrat 
     2.650      3.521      3.620      4.638      5.000      6.994 
tail(sort(colMeans(KmeansCluster6)))
     bush challenge      vote      poll   cluster  november 
    2.960     4.122     4.447     4.872     6.000    10.371 
tail(sort(colMeans(KmeansCluster7)))
   voter campaign     poll     bush    kerry  cluster 
   1.334    1.383    2.789    5.971    6.481    7.000 
table(hier = hierGroups,KMC = KMC$cluster)
    KMC
hier    1    2    3    4    5    6    7
   1    3   11   64 1045   32    0  111
   2    0    0    0    0    0  320    1
   3   85   10   42   79  126    8   24
   4   10    5    0    0    1    0  123
   5   48    0  171  145    3    1   39
   6    0    2    0  712    0    0    0
   7    0  116    0   82    1    0   10

Which Hierarchical Cluster best corresponds to K-Means Cluster 2?

  • 7

Which Hierarchical Cluster best corresponds to K-Means Cluster 3?

  • 5

Which Hierarchical Cluster best corresponds to K-Means Cluster 7?

  • No Hierarchical Cluster contains at least half of the points in K-Means Cluster 7.

Which Hierarchical Cluster best corresponds to K-Means Cluster 6?

  • 2
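
The answers above can be read off the table mechanically by converting each k-means column to shares. The sketch below re-enters the counts from the cross-tabulation printed in this section; tab, share, best_hier, and best_share are illustrative names.

```r
# Counts copied from the hier x KMC table above (rows = hierarchical cluster,
# columns = k-means cluster).
tab <- matrix(c(
   3,  11,  64, 1045,  32,   0, 111,
   0,   0,   0,    0,   0, 320,   1,
  85,  10,  42,   79, 126,   8,  24,
  10,   5,   0,    0,   1,   0, 123,
  48,   0, 171,  145,   3,   1,  39,
   0,   2,   0,  712,   0,   0,   0,
   0, 116,   0,   82,   1,   0,  10),
  nrow = 7, byrow = TRUE,
  dimnames = list(hier = as.character(1:7), KMC = as.character(1:7)))

# Share of each k-means cluster (column) that falls in each hierarchical cluster.
share <- prop.table(tab, margin = 2)

best_hier  <- apply(share, 2, which.max)  # hierarchical cluster with the largest share
best_share <- apply(share, 2, max)        # size of that share
round(rbind(best_hier, best_share), 3)
```

A column whose best share is below 0.5 has no hierarchical cluster containing at least half of its points, which is the situation for K-Means Cluster 7.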

[Discussion Questions]

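What is a word-frequency table, and what is its data format?

  • A word-frequency table records how many times particular words appear in each of a set of articles or other sources.
  • Each row is a source of text, such as a single article or one day's newspapers and magazines. Each column is a particular word; which words are included depends on what the analysis wants to examine, so there may be many columns or few.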
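
As a concrete picture of this layout, the sketch below builds a small word-frequency table in base R from three made-up one-line "documents". The texts and the names docs, vocab, and freq are invented for illustration.

```r
# Three made-up documents; real text would first need tokenizing and cleaning.
docs <- c(d1 = "bush kerry poll bush",
          d2 = "iraq war iraq",
          d3 = "poll vote poll bush")

tokens <- strsplit(docs, " ")            # one character vector per document
vocab  <- sort(unique(unlist(tokens)))   # the distinct words = future columns

# Count each vocabulary word in each document (one column per document) ...
counts <- vapply(tokens,
                 function(w) tabulate(match(w, vocab), nbins = length(vocab)),
                 integer(length(vocab)))
# ... then transpose so rows are documents and columns are words.
freq <- t(counts)
dimnames(freq) <- list(names(docs), vocab)
freq
```

Each row of freq is one observation for clustering, and each column is one variable, which matches the shape of the dailykos data.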

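When clustering on a word-frequency table, what are the segmentation variables?

  • The distinct words (word types) are the segmentation variables, and each word's count is the value of that variable.
  • After stop words are removed, the columns for the high-frequency words are kept as the selected words.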
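
The selection step can be sketched as a column filter. Everything below is made up for illustration: the counts, the stop-word list, and the frequency threshold min_total are assumptions, not values used to build the dailykos data.

```r
# Made-up word-frequency table: three documents, four words.
freq <- matrix(c(5, 4, 6,    # the
                 2, 0, 1,    # bush
                 0, 3, 0,    # iraq
                 1, 0, 0),   # zebra
               nrow = 3,
               dimnames = list(c("d1", "d2", "d3"),
                               c("the", "bush", "iraq", "zebra")))

stop_words <- c("the", "a", "of")   # hypothetical stop-word list
min_total  <- 2                     # hypothetical minimum total count

# Keep a column only if it is not a stop word and is frequent enough overall.
keep     <- !(colnames(freq) %in% stop_words) & colSums(freq) >= min_total
selected <- freq[, keep, drop = FALSE]
selected
```

Here "the" is dropped as a stop word and "zebra" is dropped as too rare, leaving the columns that would serve as segmentation variables.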

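How does choosing the number of clusters from the dendrogram differ from choosing it from the needs of the application?

  • The dendrogram is cut according to the criterion "small distances within clusters, large distances between clusters", so the cut is placed where the vertical distance in the dendrogram is large. The resulting number of clusters can differ from one data set to another.
  • The number of clusters the dendrogram suggests tends to be small, but in a real application too few clusters can make the grouping too coarse. The technical view and the application view therefore lead to different trade-offs.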







