Post-Module

Necessary packages
1. Data
2. Get familiar with the data
3. Weighted Edges
4. PageRank
5. Community Detection
6.

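Necessary packages

library(igraph)   # graph construction, PageRank, community detection
library(dplyr)    # joins and data manipulation
library(ggplot2)  # histograms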

1. Data

This dataset contains the edges of the talent flow graph. from is the id of the originating firm, to is the id of the destination firm, and migration_count is the number of workers who migrated from the from firm to the to firm up until 2018.

df_talent_flows <- read.csv("C://Users//dashv//OneDrive//Desktop//mydata//talent_flows.csv")

head(df_talent_flows)

This dataset contains the metadata for each firm. emp_count is the total number of employees LinkedIn lists for the firm. All the other columns are self-explanatory.

df_company <- read.csv("C://Users//dashv//OneDrive//Desktop//mydata//linkedin_company_metadata.csv")

head(df_company)

summary(df_company)

##   company_id            name             industry             city          
##  Length:473         Length:473         Length:473         Length:473        
##  Class :character   Class :character   Class :character   Class :character  
##  Mode  :character   Mode  :character   Mode  :character   Mode  :character  
##                                                                             
##                                                                             
##                                                                             
##                                                                             
##    country             founded          hq              overview        
##  Length:473         Min.   :1784   Length:473         Length:473        
##  Class :character   1st Qu.:1911   Class :character   Class :character  
##  Mode  :character   Median :1962   Mode  :character   Mode  :character  
##                     Mean   :1945                                        
##                     3rd Qu.:1985                                        
##                     Max.   :2015                                        
##                     NA's   :116                                         
##    emp_count     
##  Min.   :   150  
##  1st Qu.:  5738  
##  Median : 13767  
##  Mean   : 36526  
##  3rd Qu.: 33361  
##  Max.   :771986  
##

# create a dataframe called df_edges, where the first two columns are from and to. this will make it easier to work with igraph
df_edges <- df_talent_flows

head(df_edges)

2. Get familiar with the data

2.1 Create an igraph graph object using the df_edges dataframe

graph_df_talent_flows<- graph_from_data_frame(df_edges, directed = TRUE)
graph_df_talent_flows

## IGRAPH 4b2c398 DN-- 473 81114 -- 
## + attr: name (v/c), migration_count (e/n)
## + edges from 4b2c398 (vertex names):
##  [1] at&t                ->oracle               
##  [2] colgate-palmolive   ->nike                 
##  [3] agilent-technologies->stryker              
##  [4] ebay                ->expedia              
##  [5] comcast             ->republic-services-inc
##  [6] aon                 ->aig                  
##  [7] costco-wholesale    ->apple                
##  [8] facebook            ->cisco                
## + ... omitted several edges

# How many nodes?
vcount(graph_df_talent_flows)

## [1] 473

# How many edges?
ecount(graph_df_talent_flows)

## [1] 81114

# Get nodes
V(graph_df_talent_flows)

## + 473/473 vertices, named, from 4b2c398:
##   [1] at&t                                 colgate-palmolive                   
##   [3] agilent-technologies                 ebay                                
##   [5] comcast                              aon                                 
##   [7] costco-wholesale                     facebook                            
##   [9] john-deere                           ross-stores                         
##  [11] american-express                     target                              
##  [13] cme-group                            jpmorgan-chase                      
##  [15] united-airlines                      the-home-depot                      
##  [17] xerox                                wellsfargo                          
##  [19] boeing                               jefferies                           
## + ... omitted several vertices

# Get edges
E(graph_df_talent_flows)

## + 81114/81114 edges from 4b2c398 (vertex names):
##  [1] at&t                ->oracle               
##  [2] colgate-palmolive   ->nike                 
##  [3] agilent-technologies->stryker              
##  [4] ebay                ->expedia              
##  [5] comcast             ->republic-services-inc
##  [6] aon                 ->aig                  
##  [7] costco-wholesale    ->apple                
##  [8] facebook            ->cisco                
##  [9] john-deere          ->ge                   
## [10] ross-stores         ->walmart              
## + ... omitted several edges

2.2 Calculate the in-degree and out-degree for each firm. What are the top 10 firms with the highest in-degree? What are the 10 firms with the highest out-degree? Describe in your own words what these metrics mean.

# Get a list of all companies by looking at the
# 'from' and 'to' columns
company <- unique(c(df_edges[, 1], df_edges[, 2]))

# Count how many inbound links each company has
in_count <- table(factor(df_edges[, 2], levels = company))


# Turn these counted objects into dataframes
in_count <- data.frame(in_count)
colnames(in_count) <- c("company", "freq")

df_in_count <- data.frame(in_count)

titles <- read.csv("C://Users//dashv//OneDrive//Desktop//mydata//linkedin_company_metadata.csv")
join_df_in_count <-left_join(df_in_count, titles, by = c(company = "company_id"))
top_df_in_count <- join_df_in_count %>%
                    arrange(desc(freq)) %>%
                    head(10)
join_df_in_count

print("top 10 firms with highest in-degree")

## [1] "top 10 firms with highest in-degree"

top_df_in_count$name

##  [1] "IBM"                        "Accenture"                 
##  [3] "Hewlett Packard Enterprise" "AT&T"                      
##  [5] "Bank of America"            "Amazon"                    
##  [7] "Wells Fargo"                "JPMorgan Chase & Co."      
##  [9] "Microsoft"                  "Citi"

#---------------------------------------------------------------------------------------------------------------------

# Count how many outbound links each company has
out_count <- table(factor(df_edges[, 1], levels = company))


# Turn these counted objects into dataframes
out_count <- data.frame(out_count)
colnames(out_count) <- c("company", "freq")


df_out_count <- data.frame(out_count)
join_df_out_count <-left_join(df_out_count, titles, by = c(company = "company_id"))
top_df_out_count <- join_df_out_count %>%
                    arrange(desc(freq)) %>%
                    head(10)

print("top 10 firms with highest out-degree")

## [1] "top 10 firms with highest out-degree"

top_df_out_count$name

##  [1] "IBM"                        "AT&T"                      
##  [3] "Hewlett Packard Enterprise" "JPMorgan Chase & Co."      
##  [5] "Bank of America"            "Accenture"                 
##  [7] "GE"                         "Wells Fargo"               
##  [9] "Citi"                       "Target"

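A: In-degree is the number of distinct firms from which a firm has received at least one worker; out-degree is the number of distinct firms to which it has sent at least one worker (assuming one row per firm pair in the edge list). IBM ranks first on both lists, so it exchanges talent with more firms than any other company in the data. Because degree counts connections rather than workers, it measures how broadly a firm is connected in the talent market, not how many people moved or how high its turnover rate is.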
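To make the difference between degree and migration volume concrete, here is a minimal sketch on a made-up edge list (toy_edges and its values are hypothetical, not taken from the dataset). It counts links the same way as the code above and, separately, sums the workers arriving at each firm.

```r
# Hypothetical toy edge list: three firms, four directed edges
toy_edges <- data.frame(
  from = c("a", "a", "b", "c"),
  to = c("b", "c", "c", "a"),
  migration_count = c(5, 1, 2, 10)
)

firms <- unique(c(toy_edges$from, toy_edges$to))

# In-degree / out-degree: number of edges pointing into / out of each firm
in_degree <- table(factor(toy_edges$to, levels = firms))
out_degree <- table(factor(toy_edges$from, levels = firms))

# Total number of workers arriving at each firm (a different quantity)
workers_in <- tapply(toy_edges$migration_count,
                     factor(toy_edges$to, levels = firms), sum)

print(in_degree)
print(workers_in)
```

In this toy data, firm c has the highest in-degree (two source firms), while firm a receives the most workers (10, all from a single source), so the two rankings need not agree.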

2.3 You’ll notice that the firms with the highest degree are biased towards larger firms. Explain why we might expect this type of correlation?

Answer: Larger companies tend to have stronger reputations and attract more talent from the market. Workers from larger companies also tend to have more opportunities in the job market. Both effects give large companies more incoming and outgoing links, and therefore the highest degrees.

2.4. Statistically test if this is indeed the case (i.e. whether in-degree and out-degree are correlated with firm size) by running two separate linear regressions to estimate the effect of employee count (emp_count) on in-degree and out-degree. In the first model, the dependent variable is in-degree, and the independent variable is employee count. In the second model, the dependent variable is out-degree, and the independent variable is employee count.

# dataset for in-degree linear regression
reg_join_df_in_count <- join_df_in_count %>%
  select(freq,emp_count) 

reg_join_df_in_count

# dataset for out-degree linear regression
reg_join_df_out_count <- join_df_out_count %>%
  select(freq,emp_count) 


# linear model for in-degree
reg_in_deg = lm(reg_join_df_in_count$freq~reg_join_df_in_count$emp_count,
               data = reg_join_df_in_count)
summary(reg_in_deg)

## 
## Call:
## lm(formula = reg_join_df_in_count$freq ~ reg_join_df_in_count$emp_count, 
##     data = reg_join_df_in_count)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -456.46  -42.82   -3.25   47.08  143.81 
## 
## Coefficients:
##                                 Estimate Std. Error t value Pr(>|t|)    
## (Intercept)                    1.366e+02  3.407e+00   40.11   <2e-16 ***
## reg_join_df_in_count$emp_count 9.545e-04  4.365e-05   21.87   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 65.47 on 471 degrees of freedom
## Multiple R-squared:  0.5038, Adjusted R-squared:  0.5027 
## F-statistic: 478.2 on 1 and 471 DF,  p-value: < 2.2e-16

# linear model for out-degree
reg_out_deg = lm(reg_join_df_out_count$freq~reg_join_df_out_count$emp_count,
               data = reg_join_df_out_count)

summary(reg_out_deg)

## 
## Call:
## lm(formula = reg_join_df_out_count$freq ~ reg_join_df_out_count$emp_count, 
##     data = reg_join_df_out_count)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -507.64  -49.69   -2.16   56.89  134.65 
## 
## Coefficients:
##                                  Estimate Std. Error t value Pr(>|t|)    
## (Intercept)                     1.329e+02  3.684e+00   36.08   <2e-16 ***
## reg_join_df_out_count$emp_count 1.055e-03  4.721e-05   22.36   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 70.81 on 471 degrees of freedom
## Multiple R-squared:  0.5148, Adjusted R-squared:  0.5138 
## F-statistic: 499.8 on 1 and 471 DF,  p-value: < 2.2e-16

2.5. Report and interpret the results of the regression (What do the coefficients mean? What do the p-values mean?)

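Summary: Both models yield a positive, statistically significant coefficient on emp_count (p < 2e-16 in each), so we can reject the null hypothesis that employee count has no linear association with degree. In the first model, each additional employee is associated with an increase of 9.545e-04 in in-degree, or roughly 0.95 additional source firms per 1,000 employees. In the second model, each additional employee is associated with an increase of 1.055e-03 in out-degree, or roughly 1.06 additional destination firms per 1,000 employees. The intercepts (136.6 and 132.9) are the predicted degrees for a hypothetical firm with zero employees. Employee count explains about half of the variation in degree (R-squared of 0.50 and 0.51).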
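As a worked illustration of what the coefficients imply, the sketch below plugs the median employee count from summary(df_company) (13,767) into the two fitted equations. The coefficient values are copied from the regression output above; the variable names are only for illustration.

```r
# Coefficients copied from the two regression summaries above
intercept_in <- 1.366e+02
slope_in <- 9.545e-04
intercept_out <- 1.329e+02
slope_out <- 1.055e-03

# Median employee count reported by summary(df_company)
median_emp <- 13767

# Predicted degrees for a firm of median size
pred_in <- intercept_in + slope_in * median_emp
pred_out <- intercept_out + slope_out * median_emp

# Change in predicted degree per 1,000 additional employees
per_thousand_in <- slope_in * 1000
per_thousand_out <- slope_out * 1000

print(c(pred_in = pred_in, pred_out = pred_out))
```

Under these fitted lines, a firm of median size is predicted to have an in-degree of about 150 and an out-degree of about 147.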

3. Weighted Edges

3.1. Calculate the weight for each edge, and add this as a new column called weight to the existing df_edges dataframe.

The weight is migration count (migration_count) divided by the total number of employees (emp_count) of the starting firm.
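A small worked example of this definition, using made-up numbers (toy_flows and toy_firms are hypothetical) and base R's merge in place of the dplyr join used below:

```r
# Hypothetical flows and firm sizes
toy_flows <- data.frame(
  from = c("a", "a", "b"),
  to = c("b", "c", "a"),
  migration_count = c(50, 10, 30)
)
toy_firms <- data.frame(
  company_id = c("a", "b", "c"),
  emp_count = c(1000, 300, 200)
)

# Attach the employee count of the origin firm to each edge
toy_weighted <- merge(toy_flows, toy_firms,
                      by.x = "from", by.y = "company_id")

# weight = share of the origin firm's workforce that moved along the edge
toy_weighted$weight <- toy_weighted$migration_count / toy_weighted$emp_count

print(toy_weighted)
```

Here the edge b -> a gets the largest weight (30 / 300 = 0.10) even though a -> b carries more workers (50 / 1000 = 0.05), because the weight is relative to the size of the origin firm.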

# Add the weight column to df_edges 

df_edges <- df_edges %>%
  inner_join(df_company, by = c("from" = "company_id")) %>%
  select(from, to, migration_count, emp_count) %>%
  mutate(weight = migration_count/emp_count)

3.2. Create a directed graph (with weighted edges) using the df_edges dataframe.

# select the "from", "to", and "weight" columns
df_weighted_edges <- df_edges %>%
  select(from, to, weight)

# create the igraph
graph_df_talent_flows_weighted <- graph_from_data_frame(df_weighted_edges,directed = TRUE)

# view vertex
V(graph_df_talent_flows_weighted)

## + 473/473 vertices, named, from 4ba0ad3:
##   [1] at&t                                 colgate-palmolive                   
##   [3] agilent-technologies                 ebay                                
##   [5] comcast                              aon                                 
##   [7] costco-wholesale                     facebook                            
##   [9] john-deere                           ross-stores                         
##  [11] american-express                     target                              
##  [13] cme-group                            jpmorgan-chase                      
##  [15] united-airlines                      the-home-depot                      
##  [17] xerox                                wellsfargo                          
##  [19] boeing                               jefferies                           
## + ... omitted several vertices

# view edges 
E(graph_df_talent_flows_weighted)

## + 81114/81114 edges from 4ba0ad3 (vertex names):
##  [1] at&t                ->oracle               
##  [2] colgate-palmolive   ->nike                 
##  [3] agilent-technologies->stryker              
##  [4] ebay                ->expedia              
##  [5] comcast             ->republic-services-inc
##  [6] aon                 ->aig                  
##  [7] costco-wholesale    ->apple                
##  [8] facebook            ->cisco                
##  [9] john-deere          ->ge                   
## [10] ross-stores         ->walmart              
## + ... omitted several edges

3.3. Take the top 10 edges with the largest weight in the graph (and the associated nodes), and visualize it by plotting the graph.

# find the top 10 edges by weight and make it as a new data frame 
df_weighted_edges_top10 <- df_weighted_edges %>%
  arrange(desc(weight))

df_weighted_edges_top10 <- as.data.frame(head(df_weighted_edges_top10, n=10))     

print(df_weighted_edges_top10)

##                     from                         to     weight
## 1                     hp hewlett-packard-enterprise 0.13910068
## 2                abbott-                     abbvie 0.05724070
## 3            allegion-us             ingersoll-rand 0.05494505
## 4               verisign                   symantec 0.05486244
## 5   agilent-technologies      keysight-technologies 0.04563948
## 6         conocophillips               phillips66co 0.03947543
## 7       juniper-networks                      cisco 0.03780439
## 8                   ebay                     paypal 0.03577366
## 9           weyerhaeuser        international-paper 0.03391256
## 10 host-hotels-&-resorts     marriott-international 0.03214286

# create igraph for the top10 weighted edges 
graph_df_talent_flows_weighted_top10 <- graph_from_data_frame(df_weighted_edges_top10, directed=TRUE)

# plot graph
plot.igraph(graph_df_talent_flows_weighted_top10,
            layout = layout_with_kk,
            vertex.color = "plum2",
            vertex.size = 12,
            vertex.label.color = "black",
            vertex.label.font = 2,
            vertex.label.cex = .7,
            edge.color = "slategrey",
            edge.curved = .1,
            edge.arrow.size = .6,
            edge.width = 1,
            edge.label = round(E(graph_df_talent_flows_weighted_top10)$weight,2),
            edge.label.font = 2,
            edge.label.cex = .8,
            edge.label.color = "blue",
            asp = -.2,
            margin = -.01,
            main = "The top 10 edges with the largest weight")

3.4. Interpret your graph. What does it mean if an edge has a high weight? Do you notice anything interesting in the graph?

Answer: The weight is the share of the origin firm's workforce that moved to the destination firm, so a high weight means that a large fraction of company A's employees moved to company B. In the graph above, the highest weight, 0.14, belongs to hp -> hewlett-packard-enterprise. Hewlett-Packard split into HP Inc. and Hewlett Packard Enterprise in 2015, so much of this flow likely reflects employees moving between the two successor companies. ebay -> paypal is a similar case: eBay acquired PayPal in 2002 and spun it off as a separate company in 2015. More generally, talent flows between affiliated or formerly affiliated companies are especially intense. It is also notable that most of the top flows stay within the same sector: Symantec and VeriSign in cybersecurity, Keysight and Agilent in electronics, Marriott and Host Hotels & Resorts in hospitality, and so on.

4. PageRank

4.1 Explain the intuition behind PageRank using the random surfer model. How is the random surfer model different for graphs with weighted edges?

Answer: PageRank is a metric that ranks the nodes of a network by importance. The random surfer model was first applied to the web. It imagines a “random surfer” who starts on a random webpage (a node, in our case) and keeps clicking links at random, page after page, for an infinitely long time; in the standard formulation the surfer also occasionally jumps to a random page instead of following a link. The PageRank of a page is the long-run fraction of time the surfer spends on it. A page scores highly when it receives many in-bound links, especially from pages that are themselves visited often, which is what makes the score a measure of importance/centrality. The same idea extends to any network: PageRank ranks nodes by how often a surfer randomly traversing the network is expected to be at each node.

For a network with weighted edges, the probability that the surfer follows a given link is proportional to that edge's weight relative to the total weight of the current node's outgoing edges, so an edge with a higher weight is more likely to be followed. With unweighted edges, the probability is split evenly across all outgoing edges.
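The random surfer idea can be sketched as a short power iteration in base R. This is a simplified illustration, not igraph's implementation: the function pagerank_power, the toy matrix W, and the damping value 0.85 are assumptions made for the example, and the sketch assumes every node has at least one outgoing edge.

```r
# Hypothetical toy graph: W[i, j] is the weight of the edge i -> j
W <- matrix(c(0, 2, 1,
              1, 0, 0,
              1, 1, 0),
            nrow = 3, byrow = TRUE,
            dimnames = list(c("a", "b", "c"), c("a", "b", "c")))

pagerank_power <- function(W, damping = 0.85, tol = 1e-10, max_iter = 1000) {
  n <- nrow(W)
  # Row-normalize: P[i, j] is the probability of moving from i to j,
  # proportional to the weight of the edge i -> j
  P <- W / rowSums(W)
  r <- rep(1 / n, n)  # start from a uniform distribution
  for (k in seq_len(max_iter)) {
    # With probability `damping` follow a link, otherwise jump to a random node
    r_new <- (1 - damping) / n + damping * as.vector(r %*% P)
    converged <- sum(abs(r_new - r)) < tol
    r <- r_new
    if (converged) break
  }
  setNames(r, rownames(W))
}

# Weighted surfer: link choice is proportional to edge weight
pr_weighted <- pagerank_power(W)

# Unweighted surfer: every existing outgoing link is equally likely
pr_unweighted <- pagerank_power((W > 0) * 1)

print(round(rbind(weighted = pr_weighted, unweighted = pr_unweighted), 3))
```

In this toy graph the only difference between the two runs is that the edge a -> b has twice the weight of a -> c, which shifts long-run visiting time from c to b in the weighted version.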

4.2 Calculate the weighted PageRank and the unweighted PageRank for all the nodes in the graph. What are the 10 nodes with the highest unweighted PageRank? What are the 10 nodes with the highest weighted PageRank?

# compute the unweighted PageRank of every node, then take the top 10
df_pagerank_unweighted <- as.data.frame(page_rank(graph_df_talent_flows,vids = V(graph_df_talent_flows),directed = TRUE,weights = NULL)$vector)

colnames(df_pagerank_unweighted) <- c("Page_Rank")

df_pagerank_unweighted_top10 <- df_pagerank_unweighted %>%
  arrange(desc(Page_Rank))

df_pagerank_unweighted_top10 <- as.data.frame(head(df_pagerank_unweighted_top10, n=10))     

print("top 10 highest ranked companies based on pagerank(unweighted)")

## [1] "top 10 highest ranked companies based on pagerank(unweighted)"

print(df_pagerank_unweighted_top10)

##                              Page_Rank
## wellsfargo                 0.005052306
## ibm                        0.005042449
## accenture                  0.004941054
## bank-of-america            0.004872047
## at&t                       0.004823967
## hewlett-packard-enterprise 0.004822008
## amazon                     0.004773077
## jpmorgan-chase             0.004701858
## microsoft                  0.004629907
## citi                       0.004615667

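# weights must come from the edge attribute E(g)$weight; g$weight would look up a graph attribute and return NULL
df_pagerank_weighted <- as.data.frame(page_rank(graph_df_talent_flows_weighted,vids = V(graph_df_talent_flows_weighted),directed = TRUE,weights = E(graph_df_talent_flows_weighted)$weight)$vector)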

colnames(df_pagerank_weighted) <- c("Page_Rank")

df_pagerank_weighted_top10 <- df_pagerank_weighted %>%
  arrange(desc(Page_Rank))

df_pagerank_weighted_top10 <- as.data.frame(head(df_pagerank_weighted_top10, n=10))     

print("top 10 highest ranked companies based on pagerank(weighted)")

## [1] "top 10 highest ranked companies based on pagerank(weighted)"

print(df_pagerank_weighted_top10)

##                             Page_Rank
## ibm                        0.02584084
## microsoft                  0.02254956
## hewlett-packard-enterprise 0.02095859
## wellsfargo                 0.02031841
## bank-of-america            0.01918484
## jpmorgan-chase             0.01867777
## citi                       0.01614856
## accenture                  0.01609669
## google                     0.01494275
## oracle                     0.01332736

4.3 Plot the distribution (histogram) of weighted and unweighted PageRanks. Comment on the differences, and explain why these might be different.

ggplot(df_pagerank_unweighted,aes(x=Page_Rank)) + geom_histogram(fill = "slategrey") +labs(title = "Histogram for PageRank(Unweighted)", x = "Page_Rank", y = "Frequency")

ggplot(df_pagerank_weighted,aes(x=Page_Rank)) + geom_histogram(fill = "slategrey") +labs(title = "Histogram for PageRank(Weighted)", x = "Page_Rank", y = "Frequency")

Answer: The unweighted PageRank distribution is much more even: the scores fall in a narrow range, which tells us that firms are linked to a broad set of other firms and that connectivity alone does not single out a few dominant destinations. In the weighted PageRank, by contrast, a small number of companies stand out, because the weights reflect the share of each origin firm's workforce that moves along an edge, and these shares are concentrated on a few destinations. The weighted PageRank of the top companies is several times higher than their unweighted PageRank (for example, 0.0258 versus 0.0050 for IBM), so most firms are bunched at low values and the distribution is right-skewed, with a long tail toward high values.
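One rough way to quantify this difference without the plots: PageRank scores sum to 1, so with 473 firms the average score is 1/473 in both cases. The sketch below compares the largest reported score with that average, using the top values printed in the tables above (the variable names are only for illustration).

```r
n_firms <- 473
mean_score <- 1 / n_firms  # PageRank scores sum to 1 across all nodes

# Largest scores copied from the two top-10 tables above
max_unweighted <- 0.005052306  # wellsfargo
max_weighted <- 0.02584084     # ibm

# How many times larger than the average is the top score?
ratio_unweighted <- max_unweighted / mean_score
ratio_weighted <- max_weighted / mean_score

print(round(c(unweighted = ratio_unweighted, weighted = ratio_weighted), 2))
```

The top unweighted score is only about 2.4 times the average, while the top weighted score is about 12 times the average, which is consistent with a much longer right tail in the weighted histogram.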

5. Community Detection

5.1. Walktrap is an algorithm for community detection that is computationally very efficient and that we discussed briefly in class. Use the walktrap.community command in igraph to detect communities of firms. You can play around with the steps argument to get different communities; the default steps=4 seems to produce good results.

library(clustAnalytics)
walktrap <- walktrap.community(graph_df_talent_flows_weighted, steps = 4)
modularity(walktrap)

## [1] 0.3920425
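For intuition about what the modularity score measures, here is a minimal base-R sketch that computes modularity by hand for a made-up, unweighted, undirected toy graph (two triangles joined by one edge). The graph and the community labels are hypothetical, and this simplified formula ignores the edge weights and directions that the graph above carries.

```r
# Hypothetical toy graph: two triangles (1-2-3 and 4-5-6) joined by the edge 3-4
edges <- rbind(c(1, 2), c(1, 3), c(2, 3),
               c(4, 5), c(4, 6), c(5, 6),
               c(3, 4))

# Symmetric adjacency matrix
A <- matrix(0, 6, 6)
A[edges] <- 1
A[edges[, 2:1]] <- 1

# Assumed community assignment: one community per triangle
community <- c(1, 1, 1, 2, 2, 2)

m <- sum(A) / 2   # number of edges
k <- rowSums(A)   # node degrees

# Q = 1/(2m) * sum over same-community pairs of (A_ij - k_i * k_j / (2m))
same <- outer(community, community, "==")
Q <- sum((A - outer(k, k) / (2 * m)) * same) / (2 * m)

print(Q)
```

Modularity compares the share of edges that fall inside communities with the share expected if edges were placed at random with the same degrees; for this toy partition it works out to 5/14, or about 0.357.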

plot(walktrap, graph_df_talent_flows_weighted,
     vertex.color = membership(walktrap),
     layout = layout_with_fr,
     vertex.size = 3,
     vertex.label = NA,
     edge.label = NA,
     edge.color = "lightgrey",
     edge.curved = .1,
     edge.arrow.size = .1,
     edge.width = .5,
     asp = -.2,
     main = "Communities of Firms")

5.2. Inspect the members of each community. What commonalities do you observe within each community?

Answer: Firms in the same industry tend to fall in the same community. This is expected, since people look for jobs that fit their skills and experience, and most job changes happen within an industry. For instance, financial institutions such as Wells Fargo, JPMorgan Chase, Citi, US Bank and PNC Bank all belong to the same community (community 7 in the output below). Similar patterns can be observed in other industries as well.
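The long membership printout is easier to inspect when firms are grouped by community. A minimal sketch, using a small hypothetical named vector (toy_membership) as a stand-in for membership(walktrap):

```r
# Hypothetical stand-in for membership(walktrap): firm name -> community id
toy_membership <- c(wellsfargo = 7, citi = 7, apple = 1,
                    microsoft = 1, walmart = 2)

# Group firm names by community id
members_by_community <- split(names(toy_membership), toy_membership)

# Number of firms in each community
community_sizes <- lengths(members_by_community)

print(members_by_community)
print(community_sizes)
```

Applied to the real result, the same split call would list the firms of each community side by side, which makes shared industries easier to spot.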

membership(walktrap)

##                                 at&t                    colgate-palmolive 
##                                    2                                    4 
##                 agilent-technologies                                 ebay 
##                                    1                                    1 
##                              comcast                                  aon 
##                                    2                                    7 
##                     costco-wholesale                             facebook 
##                                    2                                    1 
##                           john-deere                          ross-stores 
##                                    4                                    2 
##                     american-express                               target 
##                                    2                                    2 
##                            cme-group                       jpmorgan-chase 
##                                    7                                    7 
##                      united-airlines                       the-home-depot 
##                                    2                                    2 
##                                xerox                           wellsfargo 
##                                    2                                    7 
##                               boeing                            jefferies 
##                                    2                                    7 
##                      cardinal-health                                cisco 
##                                    4                                    1 
##                                  ibm                            nordstrom 
##                                    2                                    2 
##                     johnson-controls                            honeywell 
##                                    4                                    4 
##                   unitedhealth-group                           exxonmobil 
##                                    7                                    5 
##                            microsoft                               netapp 
##                                    1                                    1 
##                         state-street                    eaton-corporation 
##                                    7                                    4 
##               church-&-dwight-co-inc                     general-dynamics 
##                                    4                                    2 
##                             gap-inc-                       alliant-energy 
##                                    2                                    5 
##                               intuit                       ppg-industries 
##                                    1                                    4 
##              kohls-department-stores                               cintas 
##                                    2                                    2 
##                                apple                       charles-schwab 
##                                    1                                    7 
##                     procter-&-gamble                              pepsico 
##                                    4                                    4 
##                              stryker                       analog-devices 
##                                    6                                    1 
##                             mckesson                    motorolasolutions 
##                                    2                                    1 
##                                fedex                             spglobal 
##                                    2                                    2 
##                      leggett-&-platt                  the-hershey-company 
##                                    2                                    4 
##                    applied-materials                            medtronic 
##                                    1                                    6 
##                                  ups                    johnson-&-johnson 
##                                    2                                    6 
##                             autodesk                                  hca 
##                                    1                                    7 
##                                   3m                            travelers 
##                                    4                                    7 
##                 zions-bancorporation                             broadcom 
##                                    7                                    1 
##                              abbott-           hewlett-packard-enterprise 
##                                    6                                    1 
##                              chevron                                  pvh 
##                                    3                                    2 
##                      kellogg-company                      bank-of-america 
##                                    4                                    7 
##                                 citi                            walgreens 
##                                    7                                    2 
##                         newellbrands                         dish-network 
##                                    4                                    2 
##                               mattel                              us-bank 
##                                    4                                    7 
##                    intel-corporation                     hilton-worldwide 
##                                    1                                    2 
##                        goldman-sachs                      parker-hannifin 
##                                    7                                    4 
##                          cummins-inc                         dow-chemical 
##                                    4                                    5 
##                              metlife                       american-water 
##                                    7                                    2 
##            lowe%27s-home-improvement                           yum-brands 
##                                    2                                    2 
##                       general-motors                   csx-transportation 
##                                    4                                    2 
##                             raytheon             marathon-oil-corporation 
##                                    2                                    3 
##                              walmart                   ford-motor-company 
##                                    2                                    4 
##          discover-financial-services                    mohawk-industries 
##                                    7                                    2 
##             fox-filmed-entertainment                             autozone 
##                                    2                                    2 
##                               citrix                     concho-resources 
##                                    1                                    3 
##                                 visa              american-electric-power 
##                                    1                                    5 
##                           cvs-health              the-kraft-heinz-company 
##                                    2                                    4 
##                               ameren               tractor-supply-company 
##                                    5                                    2 
##                                aimco                             pnc-bank 
##                                    9                                    7 
##                            h&r-block                               amazon 
##                                    2                                    1 
##                           mastercard                         schlumberger 
##                                    2                                    3 
##                                  adm              the-walt-disney-company 
##                                    4                                    2 
##             thermo-fisher-scientific                     eversourceenergy 
##                                    6                                    5 
##                               ecolab                 corning-incorporated 
##                                    4                                    4 
##                     sherwin-williams                host-hotels-&-resorts 
##                                    2                                    2 
##                        w.w.-grainger           arthur-j--gallagher-and-co 
##                                    2                                    7 
##                              paychex      the-estee-lauder-companies-inc- 
##                                    2                                    4 
##                               jacobs                      caterpillar-inc 
##                                    5                                    4 
##                            insidepmi                   idexx-laboratories 
##                                    4                                    4 
##                              textron                 willis-towers-watson 
##                                    4                                    7 
##                       tiffany-and-co                       conocophillips 
##                                    2                                    3 
##                   darden-restaurants               charter-communications 
##                                    2                                    2 
##                                 cboe                             illumina 
##                                    7                                    6 
##                             symantec                                alcoa 
##                                    1                                    4 
##                             fastenal                           borgwarner 
##                                    2                                    4 
##                        general-mills                               xilinx 
##                                    4                                    1 
##                         unionpacific                               pfizer 
##                                    2                                    6 
##                            cognizant                           nasdaq-omx 
##                                    2                                    2 
##                        ipg-photonics                        iron-mountain 
##                                    6                                    2 
##               vertex-pharmaceuticals             huntington-national-bank 
##                                    6                                    7 
##                                merck             the-j-m--smucker-company 
##                                    6                                    4 
##                                   ge                       avery-dennison 
##                                    4                                    4 
##            principal-financial-group                            flowserve 
##                                    7                                    5 
##                     consumers-energy                         baker-hughes 
##                                    5                                    3 
##                                 nike       jb-hunt-transport-services-inc 
##                                    4                                    2 
##                               google                               exelon 
##                                    1                                    5 
##                           salesforce                keysight-technologies 
##                                    1                                    1 
##            robert-half-international                                sysco 
##                                    2                                    4 
##                     maxim-integrated                                 tsys 
##                                    1                                    2 
##                               davita                         phillips66co 
##                                    2                                    3 
##                                 macy                norwegian-cruise-line 
##                                    2                                   10 
##                              labcorp                            dovercorp 
##                                    4                                    4 
##                              twitter                the-coca-cola-company 
##                                    1                                    4 
##               marriott-international                             allergan 
##                                    2                                    6 
##                 simon-property-group                                  tjx 
##                                    8                                    2 
##                         air-products                              gartner 
##                                    5                                    1 
##               people%27s-united-bank                          halliburton 
##                                    7                                    3 
##                               abbvie                             dominion 
##                                    6                                    5 
##                              equifax                              perrigo 
##                                    2                                    6 
##                               incyte                       northern-trust 
##                                    6                                    7 
##                       american-tower                   equity-residential 
##                                    2                                    9 
##                campbell-soup-company                      western-digital 
##                                    4                                    1 
##                         brown-forman                     hess-corporation 
##                                    4                                    3 
##                       morgan-stanley                         c-h-robinson 
##                                    7                                    2 
##                          foot-locker                         the-hartford 
##                                    2                                    7 
##                           con-edison               skyworks-solutions-inc 
##                                    5                                    1 
##                                  bd1                     firstenergy-corp 
##                                    6                                    5 
##                    boston-properties                          perkinelmer 
##                                    8                                    6 
##                whirlpool-corporation                              pentair 
##                                    4                                    4 
##                      electronic-arts                              cbs-com 
##                                    1                                    2 
##                       cimarex-energy                         zimmerbiomet 
##                                    3                                    6 
##                                  fis                                amgen 
##                                    7                                    6 
##                  international-paper                    texas-instruments 
##                                    4                                    1 
##                   the-clorox-company                               zoetis 
##                                    4                                    6 
##                                adobe                genuine-parts-company 
##                                    1                                    2 
##                     fifth-third-bank        regions-financial-corporation 
##                                    7                                    7 
##                         atmos-energy                     waste-management 
##                                    5                                    2 
##                              biogen-                   seagate-technology 
##                                    6                                    1 
##                      the-linde-group                              red-hat 
##                                    5                                    1 
##         northrop-grumman-corporation                    micron-technology 
##                                    2                                    1 
##                       ingersoll-rand                progressive-insurance 
##                                    4                                    7 
##                 freeport-mcmoran-inc                              verizon 
##                                    5                                    2 
##                 constellation-brands                      lyondell-basell 
##                                    4                                    5 
##                      regency-centers                            starbucks 
##                                    8                                    2 
##                              danaher                              celgene 
##                                    4                                    6 
##                       public-storage                      fmc-corporation 
##                                    2                                    5 
##                               nvidia        harley-davidson-motor-company 
##                                    1                                    4 
##                                  oxy                               carmax 
##                                    3                                    2 
##            valero-energy-corporation                            accenture 
##                                    5                                    2 
##                             newscorp                                mylan 
##                                    1                                    6 
##                                fluor                             allstate 
##                                    5                                    7 
##                 prudential-financial                               fiserv 
##                                    7                                    7 
##           newmont-mining-corporation                              nielsen 
##                                    5                                    4 
##               chipotle-mexican-grill                                  aes 
##                                    2                                    5 
##                                  aig                      gilead-sciences 
##                                    7                                    6 
##                          capital-one                 bristol-myers-squibb 
##                                    7                                    6 
##                             best-buy                         henry-schein 
##                                    2                                    4 
##                             qualcomm                               oracle 
##                                    1                                    1 
##                   dollar-tree-stores                                  pca 
##                                    2                                    4 
##                            dr-horton         raymond-james-financial-inc- 
##                                    2                                    7 
##             vulcan-materials-company                                 cbre 
##                                    5                                    8 
##            regeneron-pharmaceuticals                  united-technologies 
##                                    6                                    4 
##                              emerson                eli-lilly-and-company 
##                                    4                                    6 
##               national-oilwell-varco       franklin-templeton-investments 
##                                    3                                    7 
##    ameriprise-financial-services-inc                      lockheed-martin 
##                                    7                                    2 
##                                cigna                                  amd 
##                                    7                                    1 
##                   moodys-corporation             stanley-black-decker-inc 
##                                    7                                    4 
##                  nektar-therapeutics                            ansys-inc 
##                                    6                                    1 
##    pioneer-natural-resources-company                        kinder-morgan 
##                                    3                                    3 
##             mcdonald%27s-corporation                        citizens-bank 
##                                    2                                    7 
##                                 ulta                               humana 
##                                    2                                    7 
##                        suntrust-bank                    nucor-corporation 
##                                    7                                    5 
##                                  uhs                     juniper-networks 
##                                    7                                    1 
##                    helmerich-&-payne                               viacom 
##                                    3                                    2 
##                             m&t-bank                               kroger 
##                                    7                                    2 
##                                   hp                         lam-research 
##                                    1                                    1 
##            mgm-resorts-international                                  adp 
##                                    2                                    2 
##                     williams-company              duke-energy-corporation 
##                                    3                                    5 
##                      westrockcompany                         rollins-inc. 
##                                    4                                    2 
##                         devon-energy               cadence-design-systems 
##                                    3                                    1 
##                        western-union                        t--rowe-price 
##                                    2                                    7 
##                          f5-networks                   cerner-corporation 
##                                    1                                    2 
##                   berkshire-hathaway                                aflac 
##                                    7                                    2 
##                               lennar                torchmark-corporation 
##                                    2                                    7 
##                           dte-energy                   advance-auto-parts 
##                                    5                                    2 
##                           activision                         under-armour 
##                                    1                                    2 
##                      global-payments                mondelezinternational 
##                                    7                                    4 
##                               hasbro                                chubb 
##                                    4                                    7 
##                        priceline-com                               paccar 
##                                    2                                    2 
##                                 ball                          tyson-foods 
##                                    4                                    2 
##                      royal-caribbean                    masco-corporation 
##                                   10                                    4 
##                     hanesbrands-inc-                       kimberly-clark 
##                                    4                                    4 
##                           ims-health                                  ihs 
##                                    6                                    5 
##                 garmin-international                        alliance-data 
##                                    1                                    7 
##                   intuitive-surgical                    amerisourcebergen 
##                                    6                                    4 
##   the-cincinnati-insurance-companies                             assurant 
##                                    7                                    7 
##                       limited-brands                           bny-mellon 
##                                    2                                    7 
##                              hologic                     southern-company 
##                                    6                                    5 
##                          xcel-energy              lincoln-financial-group 
##                                    5                                    7 
##           marathon-petroleum-company                       wynn-las-vegas 
##                                    5                                    2 
##         kansas-city-southern-railway                             verisign 
##                                    2                                    1 
##                    american-airlines                           pultegroup 
##                                    2                                    2 
##                   southwest-airlines       anadarko-petroleum-corporation 
##                                    2                                    3 
##             eastman-chemical-company                            mccormick 
##                                    5                                    4 
##                                 coty                             teleflex 
##                                    4                                    6 
##                                  iff               sealed-air-corporation 
##                                    4                                    4 
##                           kla-tencor                            blackrock 
##                                    1                                    7 
##                             prologis                  rockwell-automation 
##                                    8                                    4 
##                      te-connectivity             discovery-communications 
##                                    4                                    2 
##                       dentsplysirona                               waters 
##                                    6                                    6 
##                              equinix                          centurylink 
##                                    1                                    2 
##                               paypal                       vf-corporation 
##                                    1                                    4 
##                carnival-cruise-lines                  svb-financial-group 
##                                   10                                    1 
##                                  itw                     norfolk-southern 
##                                    4                                    2 
##                avalonbay-communities              alexion-pharmaceuticals 
##                                    9                                    6 
##            hollyfrontier-corporation                           xylem-inc- 
##                                    3                                    4 
##                             celanese                  synchrony-financial 
##                                    5                                    7 
##                        sempra-energy                             macerich 
##                                    2                                    8 
##                         ralph-lauren                           nrg-energy 
##                                    2                                    5 
##                       united-rentals                               ametek 
##                                    2                                    4 
##                    boston-scientific                  akamai-technologies 
##                                    6                                    1 
##                               etrade                      alaska-airlines 
##                                    7                                    2 
##            martin-marietta-materials             nextera-energy-resources 
##                                    5                                    5 
##    mid-america-apartment-communities                   harris-corporation 
##                                    8                                    2 
##      marsh-&-mclennan-companies-inc-                                 bb&t 
##                                    7                                    7 
##                republic-services-inc                               altria 
##                                    2                                    4 
##                 edwards-lifesciences                      delta-air-lines 
##                                    6                                    2 
##                                  ipg                              entergy 
##                                    2                                    5 
##                              keybank       broadridge-financial-solutions 
##                                    7                                    7 
##                 microchip-technology                    baxter-healthcare 
##                                    1                                    6 
##                               anthem                 vornado-realty-trust 
##                                    7                                    8 
##                              netflix                         molson-coors 
##                                    1                                    4 
##                           expeditors                       dollar-general 
##                                    2                                    2 
##                         flir-systems                         michael-kors 
##                                    1                                    2 
##                              expedia                            welltower 
##                                    1                                    8 
##                     verisk-analytics                     align-technology 
##                                    7                                    1 
##              jack-henry-&-associates                 essex-property-trust 
##                                    2                                    9 
##                                 pseg                             fleetcor 
##                                    5                                    2 
##                             wellcare                      ppl-corporation 
##                                    7                                    5 
##                      cabot-oil-&-gas                          invesco-ltd 
##                                    3                                    7 
##                        digitalrealty                               resmed 
##                                    8                                    6 
##                         hormel-foods                          allegion-us 
##                                    4                                    4 
##                  extra-space-storage                o%27reilly-auto-parts 
##                                    2                                    2 
##                         crown-castle                         weyerhaeuser 
##                                    2                                    4 
##                             nisource                                 unum 
##                                    5                                    7 
##                        mosaiccompany                   centerpoint-energy 
##                                    5                                    5 
##                             msci-inc                  centene-corporation 
##                                    7                                    7 
##     take-2-interactive-software-inc-                          tripadvisor 
##                                    1                                    1 
##                        comerica-bank                        snap-on-tools 
##                                    7                                    4 
##               varian-medical-systems                               copart 
##                                    1                                    2 
##              duke-realty-corporation                              abiomed 
##                                    8                                    6 
##                                oneok                                qorvo 
##                                    3                                    1 
##                            albemarle                      lkq-corporation 
##                                    5                                    2 
##                         noble-energy                                  udr 
##                                    3                                    9 
##                        cf-industries            realty-income-corporation 
##                                    5                                    1 
##                             fortinet                  first-republic-bank 
##                                    1                                    7 
##                 edison-international                                  hcp 
##                                    5                                    8 
##                   sba-communications                     wec-energy-group 
##                                    2                                    5 
##        intercontinentalexchange-inc-          diamondback-energy-services 
##                                    7                                    3 
## alexandria-real-estate-equities-inc-                  everest-reinsurance 
##                                    8                                    7 
##             kimco-realty-corporation                   apache-corporation 
##                                    8                                    3 
##                        eog-resources                     monster-energy_2 
##                                    3                                    4 
##                             sl-green      federal-realty-investment-trust 
##                                    8                                    8 
##                          ventas-inc. 
##                                    8

6.

6.1. In your own words explain what assortative mixing means.

Answer: Assortative mixing is a bias in favor of connections between network nodes that share similar characteristics or behavior. In other words, the more similar two entities are, the more likely they are to interact. In class, smoking was given as an example of a shared behavior that leads to assortative mixing. High school grades are another example: higher-performing students are more likely to associate with other high-performing students, and low-performing students with other low-performing students. This may be because students of similar ability are placed in the same classes, which leads to more interaction and stronger relationships.
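To make the definition concrete, here is a minimal sketch on two toy graphs that are not part of the assignment data. The group labels are hypothetical, and assortativity_nominal() from igraph is used because the labels are categories rather than numbers.

```r
library(igraph)

# Toy graph 1: two triangles (vertices 1-3 and 4-6) joined by a single edge 3-4,
# so most edges connect vertices with the same (hypothetical) group label.
g_assort <- make_graph(c(1, 2, 2, 3, 1, 3,
                         4, 5, 5, 6, 4, 6,
                         3, 4), directed = FALSE)
groups_assort <- c(1, 1, 1, 2, 2, 2)

# Toy graph 2: a 4-cycle in which every edge connects vertices from different groups.
g_disassort <- make_graph(c(1, 2, 2, 3, 3, 4, 4, 1), directed = FALSE)
groups_disassort <- c(1, 2, 1, 2)

# Nominal assortativity: positive when like connects to like, negative otherwise.
r_assort <- assortativity_nominal(g_assort, types = groups_assort, directed = FALSE)
r_disassort <- assortativity_nominal(g_disassort, types = groups_disassort, directed = FALSE)

print(c(assortative = r_assort, disassortative = r_disassort))
```

The first coefficient is positive because six of the seven edges stay within a group. The second is negative because every edge crosses groups.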

6.2. Calculate the level of assortative mixing for industry, and interpret your results. HINT: Use the assortativity command in igraph. The types1 parameter should be a vector of industries corresponding to the nodes (firms).

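Answer: Using assortativity(), we get a coefficient of about -0.06, which is close to zero and so indicates neither clearly assortative nor clearly disassortative mixing. Note that the call below passes the vertex sequence V(graph_df_talent_flows_weighted) as types1 rather than a vector of industries, so the value reflects vertex indices, not industry labels: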

assortativity(graph_df_talent_flows_weighted , V(graph_df_talent_flows_weighted), types2 = NULL, directed = TRUE)

## [1] -0.06063415
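One way to follow the hint in 6.2 is sketched below. It maps each firm's industry label to an integer code and passes those codes to assortativity_nominal(). The two small data frames are hypothetical stand-ins that only reuse the column names of df_talent_flows and df_company, so the value printed here says nothing about the real talent flow graph. As far as I know this function does not take edge weights, so migration_count is ignored, and firms with a missing industry would need to be handled before this step.

```r
library(igraph)

# Hypothetical stand-ins for df_talent_flows and df_company (toy values only).
toy_flows <- data.frame(
  from = c("a", "b", "c", "d", "a"),
  to = c("b", "a", "d", "c", "c"),
  migration_count = c(10, 8, 5, 7, 1)
)
toy_company <- data.frame(
  company_id = c("a", "b", "c", "d"),
  industry = c("Banking", "Banking", "Retail", "Retail")
)

# Build a directed graph; industry becomes a vertex attribute aligned with V(g).
g <- graph_from_data_frame(toy_flows, directed = TRUE, vertices = toy_company)

# assortativity_nominal() expects integer category codes starting at 1.
industry_codes <- as.integer(factor(V(g)$industry))

r_industry <- assortativity_nominal(g, types = industry_codes, directed = TRUE)
print(r_industry)
```

Passing vertices = toy_company to graph_from_data_frame() keeps the industry vector in the same order as the vertices, which is the alignment the hint asks for.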

6.3. What are some other characteristics (come up with 2) along which you may find assortative mixing in the talent flow graph. What about disassortative mixing (come up with 1)? This is an open-ended question and the characteristics don’t have to be in the dataset but make sure to explain your answers.

Answer: Characteristics along which the talent flow graph may show assortative mixing include industry (line of business) and company size. Both can be explained by people being creatures of habit. For example, a person who knows the advertising industry well may want to move to another advertising company. From the company's perspective, a firm may prefer to hire someone who already understands the industry because it expects the new hire to perform better. As for company size, people who change jobs may look for companies of similar size. For example, someone who works at a company with 10,000+ employees may want to remain at a large company, because a move to a start-up could mean being asked to wear multiple hats. A person who wants to avoid that will stick to companies of similar size.
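The company-size conjecture could be checked with the numeric form of assortativity(). The sketch below rests on several assumptions: the data frames are hypothetical toy stand-ins, emp_count is treated as numeric, and a log10 transform is applied so that a few very large firms do not dominate the correlation.

```r
library(igraph)

# Hypothetical toy data; only the column names mirror the assignment's data frames.
toy_flows <- data.frame(
  from = c("big1", "big2", "small1", "small2", "big1"),
  to = c("big2", "big1", "small2", "small1", "small1"),
  migration_count = c(12, 9, 4, 6, 1)
)
toy_company <- data.frame(
  company_id = c("big1", "big2", "small1", "small2"),
  emp_count = c(50000, 80000, 200, 350)
)

g <- graph_from_data_frame(toy_flows, directed = TRUE, vertices = toy_company)

# Log-scale firm size (an assumption made for this sketch).
size_log <- log10(V(g)$emp_count)

# Numeric assortativity: correlation of firm size across the two ends of each edge.
r_size <- assortativity(g, size_log, directed = TRUE)
print(r_size)
```

A positive value on the real graph would support the idea that workers tend to move between firms of similar size.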

A characteristic that may show disassortative mixing is company financial/ownership status. One example is an acquisition: when a company is acquired by another company, its personnel move from the acquired firm to the acquirer, so the flow connects firms with different ownership status.