Setup

library(sf)  # provides st_read()
base_data <- st_read(params$file)
## Reading layer `run3-backpack' from data source `C:\websites\fivegbp_locatienet_com\R\geodata\run3-backpack.shp' using driver `ESRI Shapefile'
## Simple feature collection with 1500 features and 12 fields
## Geometry type: POINT
## Dimension:     XY
## Bounding box:  xmin: 4.477489 ymin: 51.02971 xmax: 4.483769 ymax: 51.04169
## Geodetic CRS:  WGS 84
# Keep only measurements with a recorded speed above 0.5
base_data <- base_data[base_data$speed > 0.5,]
data <- base_data

pop1 <- data[data$network == '4G',]
pop2 <- data[data$network == '5G',]

Data

Map

Unfiltered

Stats 4G

## [1] 888
##   pop1.latency      pop1.rsrp    pop1.arrived  
##  Min.   : 136.0   Min.   :-87   Min.   : 2.00  
##  1st Qu.: 276.0   1st Qu.:-87   1st Qu.:45.00  
##  Median : 394.5   Median :-87   Median :45.00  
##  Mean   : 420.2   Mean   :-87   Mean   :44.75  
##  3rd Qu.: 479.2   3rd Qu.:-87   3rd Qu.:45.00  
##  Max.   :2334.0   Max.   :-87   Max.   :72.00

Stats 5G

## [1] 372
##   pop2.latency      pop2.rsrp    pop2.arrived   
##  Min.   :  -1.0   Min.   :-97   Min.   :  0.00  
##  1st Qu.: 105.0   1st Qu.:-97   1st Qu.: 45.00  
##  Median : 120.0   Median :-97   Median : 45.00  
##  Mean   : 152.4   Mean   :-97   Mean   : 44.44  
##  3rd Qu.: 146.0   3rd Qu.:-97   3rd Qu.: 45.00  
##  Max.   :1508.0   Max.   :-97   Max.   :152.00
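
The code behind these summaries is not echoed in the report. The following is a minimal sketch of one way to produce output of the same shape, assuming a count from `nrow()` followed by `summary()` on a small data frame. The stand-in `pop1` and all of its values are made up for illustration.

```r
# Hypothetical stand-in for the 4G subset; the values are invented.
set.seed(1)
pop1 <- data.frame(
  latency = round(runif(20, 100, 500)),
  rsrp    = rep(-87, 20),
  arrived = rep(45, 20)
)

# Number of measurements in the group
nrow(pop1)

# Min, quartiles, median, and mean for each column of interest.
# data.frame() derives column names such as "pop1.latency" from the expressions.
stats <- summary(data.frame(pop1$latency, pop1$rsrp, pop1$arrived))
print(stats)
```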

Scatter RSRP/Latency

## Warning: 'scattergl' trace types don't currently render in RStudio on Windows. Open in another web browser (IE, Chrome, Firefox, etc).

Histograms

Filtered

Remove outliers and bad measurements, then create equal-size samples

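# Drop non-positive latencies and latencies of 1000 or more,
# and keep only rows where arrived equals 45
data <- data[data$latency > 0,]
data <- data[data$latency < 1000,]
data <- data[data$arrived == 45,]
pop1 <- data[data$network == '4G',]
pop2 <- data[data$network == '5G',]

# Shuffle each group, then keep the first n rows so both samples have equal size
n <- min(nrow(pop1), nrow(pop2))
pop1 <- pop1[sample(seq_len(nrow(pop1))),]
pop1 <- head(pop1, n)
pop2 <- pop2[sample(seq_len(nrow(pop2))),]
pop2 <- head(pop2, n)

data <- rbind(pop1, pop2)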
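
Because the subsample is drawn at random, the group statistics can change slightly between runs. A minimal sketch of a reproducible variant follows, assuming a fixed seed is acceptable. The seed value, the toy groups `g4` and `g5`, and their values are hypothetical and not taken from the report.

```r
set.seed(42)  # assumed seed, chosen only for illustration

# Hypothetical groups of unequal size
g4 <- data.frame(network = "4G", latency = rnorm(30, mean = 400, sd = 80))
g5 <- data.frame(network = "5G", latency = rnorm(12, mean = 140, sd = 40))

n <- min(nrow(g4), nrow(g5))

# sample(nrow(x), n) draws n distinct row indices without replacement,
# giving the same kind of random subsample as shuffling and taking head(n).
g4_s <- g4[sample(nrow(g4), n), ]
g5_s <- g5[sample(nrow(g5), n), ]

balanced <- rbind(g4_s, g5_s)
table(balanced$network)
```

Setting the seed once at the top of the document would make every later random step repeatable as well.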

Stats 4G

## [1] 355
##   pop1.latency     pop1.rsrp    pop1.arrived
##  Min.   :144.0   Min.   :-87   Min.   :45   
##  1st Qu.:278.5   1st Qu.:-87   1st Qu.:45   
##  Median :393.0   Median :-87   Median :45   
##  Mean   :402.9   Mean   :-87   Mean   :45   
##  3rd Qu.:466.0   3rd Qu.:-87   3rd Qu.:45   
##  Max.   :966.0   Max.   :-87   Max.   :45

Stats 5G

## [1] 355
##   pop2.latency     pop2.rsrp    pop2.arrived
##  Min.   : 83.0   Min.   :-97   Min.   :45   
##  1st Qu.:106.5   1st Qu.:-97   1st Qu.:45   
##  Median :120.0   Median :-97   Median :45   
##  Mean   :141.2   Mean   :-97   Mean   :45   
##  3rd Qu.:144.0   3rd Qu.:-97   3rd Qu.:45   
##  Max.   :953.0   Max.   :-97   Max.   :45

Scatter RSRP/Latency

## Warning: 'scattergl' trace types don't currently render in RStudio on Windows. Open in another web browser (IE, Chrome, Firefox, etc).

Histograms

QQ plot on Latency

## [1]  94 369

Because the points do not overlap the reference line in the QQ plot, the sample is assumed not to be normally distributed.
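
A visual check like this can be complemented with a formal normality test. The sketch below uses base R `qqnorm()`, `qqline()`, and `shapiro.test()` on made-up, right-skewed data. It is not the code that produced the plot in this report, and the variable `latency` here is a hypothetical stand-in.

```r
set.seed(11)

# Hypothetical right-skewed latency values (log-normal), for illustration only
latency <- rlnorm(300, meanlog = 5, sdlog = 0.6)

# QQ plot against the normal distribution with a reference line
qqnorm(latency, main = "QQ plot of latency (toy data)")
qqline(latency)

# Shapiro-Wilk test: a small p-value is evidence against normality
sw <- shapiro.test(latency)
sw$p.value
```

With large samples the Shapiro-Wilk test flags even small departures from normality, so it is best read together with the plot rather than on its own.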

Tests

If the distribution of latency values is not normal, non-parametric tests such as the Wilcoxon rank-sum test or the Kruskal-Wallis test can be used to determine whether the groups differ significantly. These tests do not assume that the data are normally distributed, which makes them suitable for this sample. Because the resulting p-values (reported as below 2.2e-16) are far below the 0.05 significance level, we conclude that latency differs significantly between the 4G and 5G populations.

Dunn’s test, also known as Dunn’s post-hoc test or the Dunn-Bonferroni test, is a pairwise comparison test that is often used as a follow-up to a significant Kruskal-Wallis test. Its purpose is to identify which specific groups differ significantly from each other. With only two groups (4G and 5G), it reduces to a single comparison.

Kruskal-Wallis

kruskal.test(latency ~ network, data = data)
## 
##  Kruskal-Wallis rank sum test
## 
## data:  latency by network
## Kruskal-Wallis chi-squared = 464.73, df = 1, p-value < 2.2e-16

Wilcoxon

pairwise.wilcox.test(data$latency, data$network, p.adjust.method = "bonferroni")
## 
##  Pairwise comparisons using Wilcoxon rank sum test with continuity correction 
## 
## data:  data$latency and data$network 
## 
##    4G    
## 5G <2e-16
## 
## P value adjustment method: bonferroni
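
With exactly two groups, the Kruskal-Wallis test and the Wilcoxon rank-sum test are closely related: when the Wilcoxon test uses the normal approximation without continuity correction, the two p-values coincide. The sketch below illustrates this on made-up data. The data frame `toy` and its values are hypothetical.

```r
set.seed(7)

# Hypothetical two-group data, for illustration only
toy <- data.frame(
  network = rep(c("4G", "5G"), each = 50),
  latency = c(rnorm(50, mean = 400, sd = 80), rnorm(50, mean = 350, sd = 80))
)

kw <- kruskal.test(latency ~ network, data = toy)

# exact = FALSE forces the normal approximation;
# correct = FALSE switches off the continuity correction
wx <- wilcox.test(latency ~ network, data = toy, exact = FALSE, correct = FALSE)

c(kruskal = kw$p.value, wilcoxon = wx$p.value)
```

The small difference between the two p-values reported above (2.2e-16 versus 2e-16) is only a matter of how the printing routines display very small values.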

Dunn’s Test

as.data.frame(data) %>% rstatix::dunn_test(latency ~ network, p.adjust.method = "bonferroni", detailed = FALSE)
  • .y.: the y (outcome) variable used in the test.
  • group1,group2: the compared groups in the pairwise tests.
  • n1,n2: the sample counts of the two groups.
  • statistic: Test statistic (z-value) used to compute the p-value.
  • p: p-value.
  • p.adj: the adjusted p-value.
  • p.signif, p.adj.signif: the significance level of p-values and adjusted p-values, respectively.

In Dunn’s test, the Z-value is used to determine the significance of each pairwise comparison between groups. It represents the standardized difference in average ranks between the two groups being compared.

Here’s how to interpret the Z-value in Dunn’s test:

  1. Calculate the Z-value:
    • In Dunn’s test, the Z-value is calculated by dividing the difference between the average ranks of the two groups by its estimated standard error.
  2. Positive or negative Z-value:
    • A positive Z-value indicates that the first group being compared has a higher average rank than the second group.
    • A negative Z-value indicates that the first group being compared has a lower average rank than the second group.
  3. Magnitude of the Z-value:
    • The magnitude of the Z-value represents the number of standard errors the difference in average ranks is away from zero.
    • A larger absolute Z-value indicates a greater difference in average ranks between the two groups being compared.
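
The steps above can be sketched in base R. This is a minimal illustration on made-up data, assuming the usual tie-corrected standard error for Dunn’s test and the sign convention described above (first group minus second group). It is not the `rstatix` implementation, and the data frame `toy` is hypothetical.

```r
set.seed(3)

# Hypothetical two-group data, for illustration only
toy <- data.frame(
  network = rep(c("4G", "5G"), times = c(40, 40)),
  latency = c(rnorm(40, mean = 400, sd = 80), rnorm(40, mean = 250, sd = 80))
)

N <- nrow(toy)
r <- rank(toy$latency)                      # ranks over the pooled sample
mean_rank <- tapply(r, toy$network, mean)   # average rank per group
n_grp <- tapply(r, toy$network, length)     # group sizes

# Tie correction: sum of (t^3 - t) over blocks of tied values; zero without ties
ties <- table(r)
tie_term <- sum(ties^3 - ties) / (12 * (N - 1))

# Standard error of the difference in average ranks
se <- sqrt((N * (N + 1) / 12 - tie_term) *
             (1 / n_grp[["4G"]] + 1 / n_grp[["5G"]]))

# Signed Z-value (4G minus 5G) and its two-sided p-value
z <- (mean_rank[["4G"]] - mean_rank[["5G"]]) / se
p <- 2 * pnorm(-abs(z))
c(z = z, p = p)
```

With two groups, the square of this Z-value equals the Kruskal-Wallis chi-squared statistic, which gives a quick consistency check between the two tests.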

Remember that Dunn’s test is a non-parametric test and does not make assumptions about the distribution of the data. The Z-values in Dunn’s test are based on the ranks rather than the raw data values.
