The results are obtained with the following parameters for the diffusion process:
the response rate is 0.04, the repost rate is 0.002, and the number of seeds is 50. The diffusion process first contacts a set of seeds and checks whether they respond. If they do not, we move on to the next set of seeds, until 50 seeds are reached. The seeds are ranked by either the degree method or the community seeding method.
For the community seeding method, we first divide the network into 50 communities. Then, we rank the seeds within each community and contact them in order, from the highest-ranked seeds to the lowest-ranked seeds.
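The within-community ranking can be sketched in base R as follows. This is only an illustration under stated assumptions: the `nodes` data frame, its column names, and the toy values are hypothetical, seeds are ranked by degree inside each community, and the contact order takes the top-ranked node of every community before moving to the second-ranked ones. The source does not spell out these details.

```r
# Hypothetical toy data: 12 nodes in 3 communities (the study uses 50).
nodes <- data.frame(
  id        = 1:12,
  community = rep(1:3, each = 4),
  degree    = c(5, 9, 2, 7, 4, 4, 8, 1, 6, 3, 10, 2)
)

# Rank nodes within each community by degree (1 = highest degree).
# ties.method = "first" breaks ties by row order; an arbitrary choice.
nodes$rank_in_comm <- ave(-nodes$degree, nodes$community,
                          FUN = function(d) rank(d, ties.method = "first"))

# Assumed contact order: every community's rank-1 node first,
# then every community's rank-2 node, and so on.
seed_order <- nodes[order(nodes$rank_in_comm, nodes$community), ]
print(head(seed_order, 3))
```

With this ordering, each community contributes one candidate before any community contributes a second.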
The response rate and the repost rate are assumed to be independent of the degree of the seeds. We can estimate the correlation from our pretest, but that estimate should be interpreted with caution.
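As a rough sketch of the seed-contact step under this independence assumption, the code below walks down a ranked candidate list and keeps the first 50 candidates who respond. The reading that "50 seeds are reached" means 50 responding seeds, the candidate list `ranked_candidates`, and its length of 5000 are all assumptions made for illustration; they are not taken from the original simulation code.

```r
set.seed(1)  # for reproducibility of this illustration only

response_rate <- 0.04  # from the text
n_seeds       <- 50    # from the text

# Hypothetical ranked list of candidate seeds (best-ranked first).
ranked_candidates <- seq_len(5000)

# Each contacted candidate responds independently with the same
# probability, regardless of degree (the independence assumption).
responds <- runif(length(ranked_candidates)) < response_rate

# Keep the first n_seeds responders in rank order.
active_seeds <- head(ranked_candidates[responds], n_seeds)

# Position of the last responder kept, i.e. how far down the
# ranked list we had to go.
n_contacted <- max(active_seeds)
```

Because the response rate is low, the list has to be followed far beyond the top 50 ranks before 50 seeds respond.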
First, bind the two data frames for degree and community seeding into a single data frame for plotting.
load("temp.RData")
result <- rbind(result_comm, result_degree)
Plot the number of impressions for the two methods and compare them.
library(ggpubr)
## Loading required package: ggplot2
p <- ggboxplot(result, x = "method", y = "no_informed",
color = "method", palette = "jco", add = "mean",
xlab = "Seeding Methods", ylab = "No. of Impressions")
# Add p-value
p + stat_compare_means()
To visualize the difference more clearly, we can also plot histograms of the results for the two methods.
gghistogram(result, x = "no_informed", y = "..density..",
add = "mean", color = "method", fill = "method",
xlab = "No. of Impressions", palette = "jco")
Next, we use a resampling test: we repeatedly draw one result at random from each method to estimate the chance that a single run of the experiment gives better seeding results for community seeding than for degree seeding.
probability <- 0
for (i in 1:10000) {
probability <- probability +
(sample(result_comm$no_informed,1) >
sample(result_degree$no_informed,1))
}
print(paste("The chance of having one experiment with better seeding results for community approach is:",
as.character(probability/10000),sep = " "))
## [1] "The chance of having one experiment with better seeding results for community approach is: 0.9294"
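The loop above is a Monte Carlo estimate, so its value changes slightly from run to run. As a cross-check, the same probability can be computed exactly by comparing every community run with every degree run. The sketch below uses simulated stand-in vectors `comm` and `degree` (hypothetical values, not the data in temp.RData); with the real data they would be `result_comm$no_informed` and `result_degree$no_informed`.

```r
set.seed(42)  # illustration only

# Hypothetical stand-ins for the two result vectors.
comm   <- rnorm(200, mean = 110, sd = 10)
degree <- rnorm(200, mean = 100, sd = 10)

# Exact version: outer() builds a 200 x 200 logical matrix of all
# pairwise comparisons; its mean is the share of pairs in which
# community seeding did better.
p_exact <- mean(outer(comm, degree, ">"))

# Monte Carlo version, a vectorised equivalent of the loop above.
n_draws <- 10000
p_mc <- mean(sample(comm, n_draws, replace = TRUE) >
             sample(degree, n_draws, replace = TRUE))

print(c(exact = p_exact, monte_carlo = p_mc))
```

The two numbers should agree up to sampling noise; the exact version has the advantage of being reproducible without setting a seed.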