Network Science Assignment#2 Danilo Martinez

1.(4 points) Consider a large random network (i.e., use Poisson Approximation) whose average degree is 10. We use cummulative density. i) What is the probability that a node chosen randomly has a degree more than 20? Poisson approximation with a large N is: P(k) = e^- ^k/k! P(k>20) 1 - P(20) It is approximately

ppois(20,10,lower.tail = FALSE,log.p = FALSE)

## [1] 0.001588261

What is the probability that a node chosen randomly has a degree less than 2? P(k) = e^- ^k/k! P(k<2) P(1) It is approximately

ppois(1,10,lower.tail = TRUE,log.p = FALSE)

## [1] 0.0004993992

What is the probability that a node chosen randomly has a degree between 8 and 12, including 8 and 12? P(k) = e^- ^k/k! P(7<k<12) P(12) - P(7) P(12)

ppois(12,10,lower.tail = TRUE,log.p = FALSE)

## [1] 0.7915565

P(7)

ppois(7,10,lower.tail = TRUE,log.p = FALSE)

## [1] 0.2202206

P(12) - P(7) It is approximately

0.7915565 - 0.2202206

## [1] 0.5713359

(4 points) What do we mean when we say that a network does not have any scale (i.e., a scale free network)? Power Law is P of k is k raised to negative gamma. Most real networks follow power law, Networks whose degree distribution follow the power law are called scale free. Hubs exist with high degrees and distance is smaller as presence of hubs makes connections shorter. In random networks, most nodes have a similar average degree, but not in scale free networks as hubs are present. Therefore, when plotting their spread degree distribution, they are in fact “scale free”, and we have to use a log-log scale to represent them.
(4 points) Explain why the degree distribution for WWW was shown in log-log scale whereas you used the linear scale to depict the same for the random network that you generated for HW 1. In a random network,the degree distribution is centered around the average degree with no presence of hubs, but not so in a scale free network such as the www. There are large diffferences in the degree distribution with hubs present and having a high degree. When we try to plot large differences, we use log-log scale to make the differences use a smaller scale. This helps to visualize the differences on a graph. Furthermore, the www is a directed network so it has Kin and Kout degrees with gamma not necessarily being equal. We can plot the before mentioned seperately or together in a log-log scale.
(8 points) Consider two random networks: one with N=10,000 and the other with N=100,000. For both, the connection probability between any two nodes is 0.3. Find:

What is the expected number of links for both the networks? N = 10,000 G(N,P) ((N 2) / 2) * P Number of possible links (N 2) = N (N-1) / 2 Probability of link = .3 Expected number of links is

((10000 * 9999) / 2) * .3

## [1] 14998500

N = 100,000 G(N,P) ((N 2) / 2) * P Number of possible links (N 2) = N (N-1) / 2 Probability of link = .3 Expected number of links is

((100000 * 99999) / 2)*.3

## [1] 1499985000

For the first network and using Poisson approximation, what is the probability of having the expected number of links per node(use the answer in (a)). We need to calculate the average degree for the first network. N = 10,000 = 2*L/N

2*14998500/10000

## [1] 2999.7

The average degree is approximately 3000 Then we calculate the Poisson Distribution probability for the expected links, which is approximately

dpois(3000,3000, log = FALSE)

## [1] 0.007283454

For the second network and using Poisson approximation, what is the probability of having the expected number of links per node(use the answer in (a)). We need to calculate the average degree for the first network. N = 100,000 = 2*L/N

2*1499985000/100000

## [1] 29999.7

The average degree is approximately 30000 Then we calculate the Poisson Distribution probability for the expected links, which is approximately

dpois(30000,30000, log = FALSE)

## [1] 0.002303288

For both networks, find the probability that the degree of a randomly chosen node will lie between the expected no. of links per node plus minus 100. We use cummulative density. N = 10,000 P(3000+100) - P(3000-100) P(2900<k<3100) P(3100)

ppois(3100,3000,lower.tail = TRUE,log.p = FALSE)

## [1] 0.9662097

P(2900)

ppois(2900,3000,lower.tail = TRUE,log.p = FALSE)

## [1] 0.03409598

P(3100) - P(2900)

0.9662097 - 0.03409598

## [1] 0.9321137

N = 100,000 P(30000+100) - P(30000-100) P(29900<k<30100) P(30100)

ppois(30100,30000,lower.tail = TRUE,log.p = FALSE)

## [1] 0.7193377

P(29900)

ppois(29900,30000,lower.tail = TRUE,log.p = FALSE)

## [1] 0.2830452

P(30100) - P(29900)

0.7193377 - 0.2830452

## [1] 0.4362925

(4 points) What is the average distance between two Facebook users? Mention the source (complete reference including date) of your finding. Also, mention the size of the network and the number of connections. List all the findings if you have more than one. According to Facebook, the average degree of seperation between users at a global level is 3.57, but for the US, it is shorter at 3.46. Facebook has 1.59 billion active users today, and it resembles the characteristics of a scale free network. Connections/friendships are estimated to be in the hundreds of billions. The average distance for the global level stated is reported at 4.57. Average distance is equal to log N divided by log of average k. The small world property shows that distance is proportional logarithmically on the system size. Therefore, it is proportional to log N, not N.
While researching the internet, one thing that I noticed was that the average degree has decreased with time, as Facebook users increased. This remains in line with our study of scale free networks. The US having a lower degree of separation could be due to more technology access and more facebook users in the US population. Facebook does have a limit on the amount of friends(connections) a user can have, and we can question the “quality” of the links/friendships that exist. With these in mind tho, one can estimate that if Facebook continues to add users, the world will become more connected, and the degrees of seperation between users will drop some more.

References Date February 8th, 2016 http://www.newsweek.com/facebook-finds-just-35-degrees-separation-between-users-423991 Date February 5th, 2016 http://www.techtimes.com/articles/130858/20160205/six-degrees-of-separation-on-facebook-it-s-just-3-57-degrees.htm *Date February 4th, 2016 https://research.facebook.com/blog/three-and-a-half-degrees-of-separation/

(4 points) What happens to a random network when the probability of link connections increases from 0 to 1? Explain what is phase transition and when does it take place? As the probability of connection increases, the more connected the network is. As the probaility of connections increase, we separate the transition into phases, and they are:

Sub-Critical Phase. Average Degree is less than 1 No giant component Isolated clusters Largest cluster resembles a tree

2)Critical Phase. Average Degree equals 1 Formation of a giant component There is a jump in cluster size

3)Super-Critical Phase. Average Degree is greater than 1 Giant component will have loops There may be isolated clusters

When the average degree is greater the logN, there will be one connected component, one cluster, and the giant component will be dense. If the probability of link connections is 1, then it is a complete graph. Every node will be connected to every other node in the network.

(3 points) What are the implications on having a very high degree divergence on i) robustness, ii) spread, and iii) attack models. As N increases, divergence goes to infinity. Scale free networks have a higher degree divergence, which demostrate a characteristic related to hub existence. In a random network, the probability of the existence of hubs is low, while in a scale free network hubs occur naturally. This decreases the average distance in a scale free network which in turn decreases robustness, increases spread, and can help in attack models. For example,we can pin point hubs and attack them in a certain order, to bring down the network.
(3 points) Why some of the real networks are scale-free and some are not? Some real networks have varying limitations, and are not truly scale free. Examples of these limitations are related to the networks we are observing themselves, and their characteristics. For example, our interstate sytem has over passes or connections, but we cannot have an endless number of connections, or over passes built. This is due to engineering limitations.