1 Summary

In this project, I used measurements of network signal quality recorded in different localities. The data include signal strength (dBm), data throughput (Mbps), and latency (ms). I averaged these three metrics for each locality and built a network in which two localities are linked when their averages are similar, that is, when the Euclidean distance between them falls below a fixed threshold.

2 Setup

library(tidyverse)
## ── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
## ✔ dplyr     1.1.4     ✔ readr     2.1.5
## ✔ forcats   1.0.0     ✔ stringr   1.5.1
## ✔ ggplot2   3.5.2     ✔ tibble    3.2.1
## ✔ lubridate 1.9.4     ✔ tidyr     1.3.1
## ✔ purrr     1.0.4     
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag()    masks stats::lag()
## ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
library(igraph)
## 
## Attaching package: 'igraph'
## 
## The following objects are masked from 'package:lubridate':
## 
##     %--%, union
## 
## The following objects are masked from 'package:dplyr':
## 
##     as_data_frame, groups, union
## 
## The following objects are masked from 'package:purrr':
## 
##     compose, simplify
## 
## The following object is masked from 'package:tidyr':
## 
##     crossing
## 
## The following object is masked from 'package:tibble':
## 
##     as_data_frame
## 
## The following objects are masked from 'package:stats':
## 
##     decompose, spectrum
## 
## The following object is masked from 'package:base':
## 
##     union
library(readr)
library(ggraph)
library(tidygraph)
## 
## Attaching package: 'tidygraph'
## 
## The following object is masked from 'package:igraph':
## 
##     groups
## 
## The following object is masked from 'package:stats':
## 
##     filter

3 Load and Prepare Data

signal_data <- read_csv("signal_metrics.csv")
## Rows: 16829 Columns: 12
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr  (2): Locality, Network Type
## dbl  (9): Latitude, Longitude, Signal Strength (dBm), Signal Quality (%), Da...
## dttm (1): Timestamp
## 
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
node_data <- signal_data %>%
  group_by(Locality) %>%
  summarize(
    signal_strength = mean(`Signal Strength (dBm)`, na.rm = TRUE),
    latency = mean(`Latency (ms)`, na.rm = TRUE),
    throughput = mean(`Data Throughput (Mbps)`, na.rm = TRUE)
  )
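
To make the aggregation step concrete, here is a minimal sketch on a toy table. The values and the `toy`, `toy_nodes`, and `n_obs` names are invented for illustration and are not part of the original data; the extra `n_obs` column is one possible way to see how many rows stand behind each average.

```r
library(dplyr)

# Toy stand-in for signal_data; values are invented for illustration only.
toy <- tibble(
  Locality = c("A", "A", "B", "B", "B"),
  `Signal Strength (dBm)` = c(-90, -80, -70, NA, -60),
  `Latency (ms)` = c(100, 120, 50, 70, 60)
)

toy_nodes <- toy %>%
  group_by(Locality) %>%
  summarize(
    n_obs = n(),  # number of rows behind each average (NA rows included)
    signal_strength = mean(`Signal Strength (dBm)`, na.rm = TRUE),
    latency = mean(`Latency (ms)`, na.rm = TRUE),
    .groups = "drop"
  )

print(toy_nodes)
```

Because `na.rm = TRUE` drops missing values silently, a locality with very few valid readings can still get an average, so a count column like this may help when judging how reliable each node's values are.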

4 Create Network

features <- as.matrix(node_data[, -1])
rownames(features) <- node_data$Locality

dist_matrix <- as.matrix(dist(features))
threshold <- 10

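# Keep pairs closer than the threshold, excluding the zero diagonal.
# dist_matrix is symmetric, so each pair is returned twice: (i, j) and (j, i).
edges <- which(dist_matrix < threshold & dist_matrix != 0, arr.ind = TRUE)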

edge_list <- data.frame(
  from = rownames(dist_matrix)[edges[,1]],
  to = rownames(dist_matrix)[edges[,2]]
)

g <- graph_from_data_frame(d = edge_list, vertices = node_data, directed = FALSE)
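
The three features are measured in different units (dBm, ms, Mbps), and a fixed cutoff of 10 depends on those units. The sketch below shows one possible alternative, not the method used in this report: standardize the columns with `scale()`, pick the cutoff as a quantile of the pairwise distances, and read edges from the upper triangle only so each pair appears once. The toy values, the 35% quantile, and all `toy_*` names are assumptions made for illustration.

```r
# Toy feature matrix: rows are hypothetical localities; columns are
# signal strength (dBm), latency (ms), throughput (Mbps). Values are invented.
toy_features <- rbind(
  A = c(-85, 110, 20),
  B = c(-84, 112, 21),
  C = c(-60,  40, 80),
  D = c(-62,  45, 75)
)

# Standardize each column so no single unit dominates the Euclidean distance.
scaled <- scale(toy_features)
toy_dist <- as.matrix(dist(scaled))

# Assumed rule: use the 35% quantile of pairwise distances as the cutoff.
pair_dist <- toy_dist[upper.tri(toy_dist)]
cutoff <- quantile(pair_dist, 0.35)

# Upper triangle only, so each unordered pair yields a single edge.
keep <- which(upper.tri(toy_dist) & toy_dist <= cutoff, arr.ind = TRUE)
toy_edges <- data.frame(
  from = rownames(toy_dist)[keep[, 1]],
  to   = rownames(toy_dist)[keep[, 2]]
)
print(toy_edges)
```

On this toy matrix only the two closest pairs (A with B, C with D) are linked. The quantile is a tuning choice: a lower value gives a sparser graph, a higher one a denser graph.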

5 igraph Network Plot

plot(g,
     vertex.label.cex = 0.7,
     vertex.label.color = "black",
     vertex.label.dist = 1,
     vertex.color = "gold",
     vertex.size = degree(g) * 1.5,
     edge.color = "gray70",
     edge.arrow.size = 0.3,
     layout = layout_with_fr,
     main = "Signal Similarity Network of Localities")
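
Multiplying the degree directly can give very large vertices when degrees are high. As a rough sketch, a small helper (hypothetical, named `rescale_size` here) could map any numeric vector onto a fixed size range, with a fallback for the case where all values are equal.

```r
# Hypothetical helper: map a numeric vector onto [min_size, max_size].
rescale_size <- function(x, min_size = 5, max_size = 15) {
  rng <- range(x, na.rm = TRUE)
  if (rng[1] == rng[2]) {
    # All values equal (e.g. every node has the same degree): use the midpoint.
    return(rep((min_size + max_size) / 2, length(x)))
  }
  min_size + (x - rng[1]) / (rng[2] - rng[1]) * (max_size - min_size)
}

sizes_varied <- rescale_size(c(1, 3, 5))     # 5, 10, 15
sizes_equal  <- rescale_size(c(38, 38, 38))  # 10, 10, 10
print(sizes_varied)
print(sizes_equal)
```

In the plot call this would be used as `vertex.size = rescale_size(degree(g))`.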

6 ggraph Visualization

graph_tbl <- as_tbl_graph(g)

ggraph(graph_tbl, layout = "fr") +
  geom_edge_link(alpha = 0.3, color = "gray70") +
  geom_node_point(aes(size = centrality_degree()), color = "steelblue", show.legend = FALSE) +
  geom_node_text(aes(label = name), repel = TRUE, size = 3) +
  scale_size_continuous(range = c(3, 10)) +
  labs(
    title = "Signal Similarity Network of Localities (ggraph)",
    subtitle = "Node size reflects centrality (degree)"
  ) +
  theme_graph()
## Warning in grid.Call(C_stringMetric, as.graphicsAnnot(x$label)): font family
## not found in Windows font database
## Warning in grid.Call(C_textBounds, as.graphicsAnnot(x$label), x$x, x$y, : font
## family not found in Windows font database
## Warning in grid.Call.graphics(C_text, as.graphicsAnnot(x$label), x$x, x$y, :
## font family not found in Windows font database

7 Network Analysis

cat("Network Density:", edge_density(g), "\n")
## Network Density: 2
cat("Average Path Length:", mean_distance(g), "\n")
## Average Path Length: 1
cat("Clustering Coefficient:", transitivity(g), "\n")
## Clustering Coefficient: 1
cat("Top Localities by Degree:\n")
## Top Localities by Degree:
print(head(sort(degree(g), decreasing = TRUE), 5))
##     Anandpuri      Anisabad Ashok Rajpath   Bailey Road     Bankipore 
##            38            38            38            38            38
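
A density above 1 is not possible for a simple undirected graph, so a value of 2 suggests that each pair of nodes is joined by two parallel edges. The toy sketch below reproduces that effect on four made-up nodes and shows how igraph's `simplify()` collapses the duplicates; the node names and the `g_multi` / `g_simple` objects are illustrative only.

```r
library(igraph)

# Four nodes, every pair listed in both directions, which is what which()
# returns when applied to a symmetric distance matrix.
pairs <- t(combn(c("A", "B", "C", "D"), 2))
doubled <- rbind(pairs, pairs[, 2:1])
g_multi <- graph_from_data_frame(
  data.frame(from = doubled[, 1], to = doubled[, 2]),
  directed = FALSE
)

# simplify() removes parallel edges (and self-loops) by default.
g_simple <- simplify(g_multi)

cat("Density with duplicate edges:", edge_density(g_multi), "\n")   # 12 / 6
cat("Density after simplify():   ", edge_density(g_simple), "\n")  # 6 / 6
print(degree(g_multi))   # each node counts every neighbour twice
print(degree(g_simple))  # each node counts every neighbour once
```

On the toy graph the density drops from 2 to 1 and each degree is halved, which is the same pattern one would expect after deduplicating the edges of a fully connected network.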

8 Conclusion

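This project let me explore how localities relate to one another based on the similarity of their average signal strength, latency, and throughput. The metrics above show that, with a distance threshold of 10, the network does not yet separate localities: the average path length and the clustering coefficient are both 1, and the top five localities all have the same degree of 38. Taken together, these values indicate that every locality is linked to every other. The reported density of 2 exceeds the maximum of 1 for a simple graph because each pair of localities enters the edge list twice, once as (i, j) and once as (j, i). To find out which localities really stand apart, and where the network is strong or might need to be improved, the next step would be to remove the duplicate edges and use a stricter threshold or standardized features.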