Mobile Network Traffic Prediction Use Case 1

14 May, 2020

Network Development Analytics (NDA) Team

Network Intelligence and Performance Management (NIPM)

NIPM is the Telekom Malaysia (TM) department that provides analysis to management and technical teams. Webe is the Mobility Center of Excellence of the TM Group.
Webe technicians and engineers have requested mobile network traffic prediction to support their technical operations, locate congested cells for improvement, and evaluate the factors affecting mobile traffic utilization.
NIPM requires an analytics model built from sample data. The analytics task is to predict the throughput (download/upload) of Webe network traffic.
The expected benefits of this project are a constructive approach to network management and planning, and support for executing Smart Capex for Webe.
All project results will be reported to TM management and the Webe team for further review and verification.

For each node/cell, we plan to predict the future trends in network throughput (upload and download).
The model should be improved regularly, and an acceptable accuracy threshold should be set (i.e. accuracy above 75%).
If the accuracy falls below the acceptable threshold, the model should be revalidated.
The proposed solution uses AutoML (H2O) to build multiple models and to automate the process.
The model must support both stationary and non-stationary data sets for prediction (stationarity is checked with the Dickey-Fuller test).
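The stationarity check mentioned above can be sketched as follows. This is a minimal, simplified illustration and not the project's actual implementation: it runs the basic (non-augmented) Dickey-Fuller regression with a constant and no lagged differences, and compares the t-statistic with an approximate asymptotic 5% critical value of -2.86. The function names, the synthetic series, and the choice of critical value are assumptions made for this example only.

```python
import math
import random

# Approximate asymptotic 5% critical value for the Dickey-Fuller test
# with a constant and no trend (an assumption for this sketch).
DF_CRITICAL_5PCT = -2.86


def dickey_fuller_t(series):
    """Return the t-statistic of gamma in: diff(y)_t = alpha + gamma * y_(t-1) + e_t."""
    lagged = series[:-1]
    diffs = [b - a for a, b in zip(series[:-1], series[1:])]
    n = len(lagged)
    if n < 3:
        raise ValueError("need at least 4 observations")
    x_bar = sum(lagged) / n
    z_bar = sum(diffs) / n
    sxx = sum((x - x_bar) ** 2 for x in lagged)
    if sxx == 0:
        raise ValueError("series is constant")
    sxz = sum((x - x_bar) * (z - z_bar) for x, z in zip(lagged, diffs))
    gamma = sxz / sxx
    alpha = z_bar - gamma * x_bar
    ssr = sum((z - alpha - gamma * x) ** 2 for x, z in zip(lagged, diffs))
    if ssr == 0:
        raise ValueError("regression fits perfectly; t-statistic undefined")
    std_err = math.sqrt(ssr / (n - 2) / sxx)
    return gamma / std_err


def looks_stationary(series, critical=DF_CRITICAL_5PCT):
    """True when the unit-root hypothesis is rejected at the chosen critical value."""
    return dickey_fuller_t(series) < critical


# Synthetic data for illustration only (not Webe measurements).
rng = random.Random(0)
noise = [rng.gauss(0.0, 1.0) for _ in range(200)]  # stationary white noise
walk = []                                          # random walk built from the same steps
level = 0.0
for step in noise:
    level += step
    walk.append(level)

print("noise t-statistic:", dickey_fuller_t(noise))
print("walk t-statistic:", dickey_fuller_t(walk))
```

A series flagged as non-stationary would typically be differenced before modelling. A production check would more likely use an augmented test from a statistics library rather than this hand-written version.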

best model

The analytics model will be validated using several metrics, including:
- Error = average_dl_pdcp_layer_throughput_mbps - predicted
- Error percentage = error / average_dl_pdcp_layer_throughput_mbps * 100
- Variance
- Standard Deviation
- Mean Absolute Error (MAE)
- Root Mean Square Error (RMSE)
- Mean Absolute Percentage Error (MAPE) - prediction accuracy of a forecasting method
- Mean Percentage Error (MPE) - how much the forecasts of a model differ from actual values, with sign
- Skewness rate - whether the distribution leans to the left or to the right
- Kurtosis rate - measure of the tailedness
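The metrics listed above can be computed as in the following sketch. It assumes the error is actual minus predicted, as in the first formula, that the variance is the sample variance (divisor n - 1), and that kurtosis is the non-excess moment ratio (a normal distribution gives 3). The function name `forecast_metrics` and the input values in the usage lines are hypothetical.

```python
import math


def forecast_metrics(actual, predicted):
    """Compute validation metrics from paired actual and predicted throughput values."""
    if len(actual) != len(predicted) or len(actual) < 2:
        raise ValueError("need two equally sized lists with at least 2 values")
    if any(a == 0 for a in actual):
        raise ValueError("percentage errors are undefined when an actual value is 0")

    n = len(actual)
    errors = [a - p for a, p in zip(actual, predicted)]
    pct_errors = [e / a * 100 for e, a in zip(errors, actual)]
    mean_error = sum(errors) / n

    # Central moments of the error distribution.
    m2 = sum((e - mean_error) ** 2 for e in errors) / n
    m3 = sum((e - mean_error) ** 3 for e in errors) / n
    m4 = sum((e - mean_error) ** 4 for e in errors) / n
    if m2 == 0:
        raise ValueError("all errors are identical; skewness and kurtosis are undefined")

    variance = m2 * n / (n - 1)  # sample variance
    return {
        "n": n,
        "mean": mean_error,
        "var": variance,
        "std": math.sqrt(variance),
        "mae": sum(abs(e) for e in errors) / n,
        "rmse": math.sqrt(sum(e * e for e in errors) / n),
        "mape": sum(abs(p) for p in pct_errors) / n,
        "mpe": sum(pct_errors) / n,
        "skew": m3 / m2 ** 1.5,
        "kurtosis": m4 / m2 ** 2,
    }


# Made-up values for illustration only.
example = forecast_metrics([10.0, 20.0, 30.0, 40.0], [9.0, 22.0, 30.0, 37.0])
for name, value in example.items():
    print(name, value)
```

MPE keeps the sign of each error, so positive and negative errors cancel; MAPE does not. A model can therefore show a small MPE together with a large MAPE.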

n	mean	var	std	mae	rmse	mape	mpe	skew	kurtosis
88	-0.01906283	0.3401222	0.5832	0.37808	0.5801901	0.3778038	-0.2381319	-0.1065638	2.867807

The outcome should improve stakeholder understanding of network usage and predictive analytics.
Predicting network usage makes it possible to take constructive steps to enhance customer service.
The current model is built with the Distributed Random Forest (DRF) algorithm, which uses multithreading to improve prediction efficiency.
For the current model, no Grid Search was used for optimization (hyperparameter tuning); only K-Fold Cross-Validation (K = 5) has been applied.
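To illustrate what 5-fold cross-validation means for a data set of this size, the sketch below splits row indices into five folds and yields each train/validation pair. It is only an illustration of the idea: the contiguous fold layout and the function names are assumptions, and H2O assigns rows to folds with its own scheme.

```python
def k_fold_indices(n_rows, k=5):
    """Split row indices 0..n_rows-1 into k contiguous folds of near-equal size."""
    if not 2 <= k <= n_rows:
        raise ValueError("k must be between 2 and the number of rows")
    base, extra = divmod(n_rows, k)
    folds = []
    start = 0
    for i in range(k):
        size = base + (1 if i < extra else 0)  # first `extra` folds get one more row
        folds.append(list(range(start, start + size)))
        start += size
    return folds


def train_validation_splits(n_rows, k=5):
    """Yield (train_indices, validation_indices) once per fold."""
    folds = k_fold_indices(n_rows, k)
    for i, validation in enumerate(folds):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, validation


# 88 rows matches the n reported in the metrics table above.
for fold_number, (train, validation) in enumerate(train_validation_splits(88), start=1):
    print(fold_number, len(train), len(validation))
```

Each row is used for validation exactly once and for training in the other four rounds, so the reported metrics are not computed on rows the model was trained on in that round.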

Example :

Predicted Download Throughput - CB0192_TM BUKIT ASA _011

Example :

Predicted Upload Throughput - CB0192_TM BUKIT ASA _011

Example :

Raw Dataset

K-means (Hartigan-Wong, Lloyd, Forgy, MacQueen) is proposed for clustering in the next use case (Cell Node Classification).
The clustering of each Cell Node will be based on its GPS location (lat, long) as well as its aggregate throughput/volume.
The analytics model should be automated rather than tied to a single classifier (AutoML is proposed).
Hyperparameter tuning will be embedded to optimize the model and improve the accuracy of predictions.
Next use case 2 - Webe Network Traffic (Utilization) Prediction.
Next use case 3 - Webe Network Traffic Cell Node Classification.
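As a rough sketch of the clustering proposed for the Cell Node Classification use case, the code below implements the Lloyd iteration of K-means, starting from centers picked at random from the data points. The function names and the sample coordinates are made up for illustration. The sketch works on plain numeric tuples, so a throughput value could be added as a third coordinate, but the features would then need to be scaled to comparable ranges first, which is not shown here.

```python
import random


def squared_distance(p, q):
    """Squared Euclidean distance between two points of equal dimension."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans_lloyd(points, k, seed=0, max_iter=100):
    """Cluster points with Lloyd's algorithm; returns (labels, centers)."""
    if not 1 <= k <= len(points):
        raise ValueError("k must be between 1 and the number of points")
    rng = random.Random(seed)
    # Initial centers: k data points chosen at random.
    centers = [tuple(p) for p in rng.sample(points, k)]
    labels = None
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest center.
        new_labels = [
            min(range(k), key=lambda c: squared_distance(p, centers[c]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        # Update step: move each center to the mean of its members.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:  # an empty cluster keeps its previous center
                centers[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels, centers


# Made-up (x, y) positions standing in for cell locations.
demo_points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
               (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
demo_labels, demo_centers = kmeans_lloyd(demo_points, k=2)
print(demo_labels)
print(demo_centers)
```

The result can depend on the initial centers, so in practice the algorithm is usually run from several random starts and the partition with the lowest within-cluster sum of squares is kept.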

The End