Diesel Sustainability Analysis Across Telecom Network Sites
Author
Victoria Idegwu | Matric No: 2025-MMBA-08-0032
Published
May 13, 2026
Executive Summary
This study examines diesel sustainability across selected telecom network sites using operational spot-check data from IHS Towers. As a Manager, Regional Technical Operations, I work closely with site performance, diesel monitoring, grid availability, vendor follow-up, and operational stability. These areas are important because many telecom sites still depend on diesel generators when grid power is unavailable, unstable, or insufficient. If diesel levels and consumption patterns are not properly monitored, sites can move quickly towards dry-date positions, creating avoidable service risks.
The dataset used for this analysis contains 118 site observations. It includes site ID, grid availability, diesel level at spot-check, date of spot-check, field consumption per day (CPD), and available days. Available days was used as the main outcome variable because it is the operational KPI used by the team to estimate how long each site can continue operating before reaching a critical diesel position.
The study applied exploratory data analysis, visualisation, hypothesis testing, correlation analysis, and regression analysis. The results show that diesel level has a positive relationship with available days, while CPD has a negative relationship with available days. The analysis also shows that grid availability makes a statistically significant difference in available days.
The regression model explains about 72.5% of the variation in available days. Diesel level, grid availability, and CPD were all statistically significant explanatory variables for the available-days KPI. The findings support a more risk-based approach to diesel planning, where sites are prioritised using available days, CPD, diesel level, and grid status together rather than diesel level alone.
1. Introduction
Telecom network sites require stable power to remain available to customers and mobile network operators. In practice, grid power is not always reliable across all locations. Some sites have grid supply, while others depend more heavily on diesel generators. This makes diesel sustainability an important operational concern.
In regional technical operations, one of the key questions is not only whether a site has diesel, but how long that diesel can realistically sustain the site. A site with a moderate diesel level may still be at risk if its daily consumption is high. In the same way, a site with grid availability may preserve diesel for longer than a site operating mainly on generator supply.
This study focuses on the relationship between diesel level, grid availability, CPD, and available days across selected telecom network sites. The aim is to use actual operational data to identify the factors that influence diesel sustainability and to support better site prioritisation.
The business question guiding the analysis is:
How do diesel level, grid availability, and consumption per day affect available days across telecom network sites?
2. Professional Disclosure
I work with IHS Towers as an MRTO — Manager, Regional Technical Operations. IHS Towers is a telecommunications infrastructure company that provides and manages critical infrastructure used by mobile network operators. My role involves monitoring site performance, diesel sustainability, grid availability, vendor response, and general operational stability across network sites.
Diesel monitoring is directly linked to my day-to-day work because a site with poor diesel sustainability can quickly become an operational risk. If such sites are not identified early, they may require emergency refuelling, escalation, or urgent vendor intervention. This can affect network availability and increase operational pressure on the regional team.
The five techniques used in this study are relevant to this operational context. Exploratory Data Analysis helps me understand the current condition of the dataset and identify unusual values or gaps. Visualisation helps present the pattern of diesel level, CPD, grid availability, and available days in a way that is easy for operations teams to understand. Hypothesis testing helps check whether observed differences or relationships are statistically meaningful. Correlation analysis helps show the strength and direction of relationships among key variables. Regression analysis helps estimate how diesel level, grid availability, and CPD jointly explain the available-days KPI.
These techniques are also aligned with the course textbook’s focus on using data analytics methods to support business decision-making. In this case, the analysis supports practical decisions around diesel planning, site prioritisation, and proactive intervention.
3. Data Collection and Sampling
The dataset was collected from operational diesel spot-check records across selected telecom network sites. The records contain site-level information on diesel level, grid availability, field CPD, date of spot-check, and available days.
The sampling frame consists of telecom sites under regional technical operations monitoring. Each observation represents one site spot-check record. The dataset contains 118 observations, which satisfies the minimum requirement of at least 100 observations for this assessment.
The variables used in this study are:
Site_ID: Unique site identifier.
Grid_Availability: Whether grid power was available at the site.
Diesel_Level: Diesel volume recorded during the spot-check.
Spotcheck_Date: Date the site diesel position was captured.
CPD: Field consumption per day.
Available_Days: Estimated number of days the available diesel can sustain the site. This was obtained from the team’s operational diesel monitoring records and is based on diesel level and field CPD.
Available days is used as an operational sustainability KPI by the team because it translates diesel level and consumption rate into a practical planning indicator. It helps identify which sites are closest to dry-date risk and should be prioritised for intervention.
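As a rough illustration of how the KPI combines the two inputs, the sketch below assumes that available days is simply diesel level divided by CPD. The monitoring records only state that the KPI is based on these inputs, so the exact formula is an assumption, and the site names and values are hypothetical.

```r
# Hypothetical sites (illustrative values, not taken from the dataset)
sites <- data.frame(
  Site_ID      = c("SITE_A", "SITE_B"),
  Diesel_Level = c(400, 400),  # litres recorded at spot-check
  CPD          = c(20, 80),    # litres consumed per day
  stringsAsFactors = FALSE
)

# Assumed formula: available days = diesel level / consumption per day
sites$Available_Days <- sites$Diesel_Level / sites$CPD

print(sites)
```

Both hypothetical sites hold the same volume of diesel, yet the higher-consumption site would last a quarter as long, which is why diesel level alone can be misleading.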
The data was used for academic analysis and does not include customer personal information. The site IDs are used only for operational identification. The dataset is treated as internal operational data and is cited in the references section as primary organisational data collected from regional technical operations spot-check records.
4. Data Loading and Preparation
Code
    # Load packages
    library(tidyverse)
    library(ggplot2)
    library(corrplot)
    library(broom)
    library(knitr)
    library(readxl)

    # Cleaner output settings
    knitr::opts_chunk$set(message = FALSE, warning = FALSE)

    # Load dataset
    my_data <- read_excel("DA Exam Data.xlsx")

    # Rename columns for easier analysis
    colnames(my_data) <- c("SN", "Site_ID", "Grid_Availability", "Diesel_Level",
                           "Spotcheck_Date", "CPD", "Available_Days")

    # Convert Grid Availability to numeric
    # Yes = 1, No = 0
    my_data$Grid_Availability <- ifelse(trimws(my_data$Grid_Availability) == "Yes", 1, 0)

    # View first few rows
    head(my_data)
The dataset contains both numerical and categorical variables. Diesel level, CPD, and available days are numerical variables. Grid availability is a categorical variable that was converted into a binary numeric variable for analysis, where 1 represents sites with grid availability and 0 represents sites without grid availability.
5. Descriptive Statistics
           SN            Site_ID          Grid_Availability  Diesel_Level
     Min.   :  1.00   Length   :118      Min.   :0.0        Min.   :  30.0
     1st Qu.: 30.25   N.unique :118      1st Qu.:0.0        1st Qu.: 200.2
     Median : 59.50   N.blank  :  0      Median :0.5        Median : 300.0
     Mean   : 59.50   Min.nchar: 13      Mean   :0.5        Mean   : 486.3
     3rd Qu.: 88.75   Max.nchar: 13      3rd Qu.:1.0        3rd Qu.: 495.2
     Max.   :118.00                      Max.   :1.0        Max.   :2382.0

     Spotcheck_Date                     CPD           Available_Days
     Min.   :2026-04-08 00:00:00   Min.   : 10.00   Min.   : 0.600
     1st Qu.:2026-05-09 00:00:00   1st Qu.: 15.00   1st Qu.: 4.079
     Median :2026-05-11 00:00:00   Median : 40.00   Median :10.430
     Mean   :2026-05-05 15:51:51   Mean   : 41.19   Mean   :13.997
     3rd Qu.:2026-05-11 00:00:00   3rd Qu.: 58.00   3rd Qu.:23.363
     Max.   :2026-05-11 00:00:00   Max.   :138.00   Max.   :43.000
The summary statistics show that the dataset contains 118 site observations. The average diesel level is approximately 486.3 litres, the average CPD is approximately 41.19 litres per day, and the average available days is approximately 14 days. The medians for diesel level and available days are lower (300 litres and about 10.4 days), which indicates that a smaller number of well-stocked sites pull the averages up. The dataset also shows clear variation across sites: available days ranges from 0.6 to 43.
From an operational point of view, this variation is important because it means that sites should not be treated equally during diesel planning. Sites with short available days require faster attention than sites with longer sustainability.
6. Exploratory Data Analysis
Exploratory Data Analysis was carried out to understand the dataset before applying more formal statistical methods. The focus was on checking the structure of the data, identifying missing values, reviewing possible outliers, and understanding the basic distribution of diesel level, CPD, and available days.
The missing value check shows that there are no missing values in the dataset. This means all 118 site observations are complete for the selected variables and can be used for the analysis.
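A check of this kind can be sketched with base R as follows. The data frame here is hypothetical and deliberately contains gaps so that the output is informative; on the actual dataset every count would be zero, as reported above.

```r
# Hypothetical data with deliberate gaps (not the real spot-check records)
demo <- data.frame(
  Diesel_Level   = c(120, 300, NA, 900),
  CPD            = c(60, 15, 45, 35),
  Available_Days = c(2, 20, NA, 25.7)
)

# Number of missing values in each column
missing_per_column <- colSums(is.na(demo))
print(missing_per_column)

# Number of rows with no missing value in any column
complete_rows <- sum(complete.cases(demo))
print(complete_rows)
```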
      Diesel_Level         CPD           Available_Days
     Min.   :  30.0   Min.   : 10.00   Min.   : 0.600
     1st Qu.: 200.2   1st Qu.: 15.00   1st Qu.: 4.079
     Median : 300.0   Median : 40.00   Median :10.430
     Mean   : 486.3   Mean   : 41.19   Mean   :13.997
     3rd Qu.: 495.2   3rd Qu.: 58.00   3rd Qu.:23.363
     Max.   :2382.0   Max.   :138.00   Max.   :43.000
Two data quality issues were considered during the EDA stage. First, missing values were checked across all variables, and the result showed no missing values. Therefore, no observation had to be removed due to incomplete data. Second, possible outliers were reviewed in diesel level, CPD, and available days. Some sites had very high diesel levels and available days compared with others, but these values were retained because they reflect realistic operational differences in site configuration, diesel holding capacity, grid condition, and consumption pattern.
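One common way to make an outlier review concrete is Tukey's 1.5 × IQR rule. The sketch below applies it to the rounded quartiles and maxima printed in the summary output. The rule is an illustrative choice and not necessarily the method used in the original review.

```r
# Quartiles and maxima copied from the summary output (rounded values)
q <- data.frame(
  variable = c("Diesel_Level", "CPD", "Available_Days"),
  q1       = c(200.2, 15, 4.079),
  q3       = c(495.2, 58, 23.363),
  max      = c(2382, 138, 43),
  stringsAsFactors = FALSE
)

# Interquartile range and Tukey's upper fence (Q3 + 1.5 * IQR)
q$iqr         <- q$q3 - q$q1
q$upper_fence <- q$q3 + 1.5 * q$iqr

# Does the largest observed value lie beyond the fence?
q$max_exceeds_fence <- q$max > q$upper_fence

print(q)
```

Under this rule the largest diesel level and the largest CPD fall above their upper fences, while the largest available-days value does not. This is consistent with retaining the values and treating them as genuine operational differences rather than errors.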
This stage helped confirm that the dataset was suitable for analysis. It also showed that available days varies meaningfully across sites, which makes it a useful outcome variable for diesel sustainability analysis.
7. Data Visualisation
Visualisation was used to communicate the main patterns in the data. The charts focus on the relationships between diesel level, CPD, grid availability, and available days. These visualisations are useful because operations managers often need quick and clear insights for site prioritisation.
7.1 Diesel Level and Available Days
Code
    ggplot(my_data, aes(x = Diesel_Level, y = Available_Days)) +
      geom_point() +
      geom_smooth(method = "lm", se = FALSE) +
      labs(
        title = "Diesel Level vs Available Days",
        x = "Diesel Level",
        y = "Available Days"
      )
The chart shows a positive relationship between diesel level and available days. This means that sites with higher diesel levels generally have more available days. This is expected operationally because diesel level is a key input in the available-days KPI.
7.2 CPD and Available Days
Code
    ggplot(my_data, aes(x = CPD, y = Available_Days)) +
      geom_point() +
      geom_smooth(method = "lm", se = FALSE) +
      labs(
        title = "CPD vs Available Days",
        x = "Consumption Per Day",
        y = "Available Days"
      )
This chart shows the relationship between CPD and available days. The relationship is negative but weaker than the one for diesel level (the correlation reported later is -0.292): sites with higher CPD consume diesel faster and tend to have fewer available days. This is important because diesel volume alone may not give the full picture. A site may appear safe based on diesel level, but if its CPD is high, it may still require early intervention.
7.3 Grid Availability and Available Days
Code
    ggplot(my_data, aes(x = factor(Grid_Availability), y = Available_Days)) +
      geom_boxplot() +
      labs(
        title = "Grid Availability vs Available Days",
        x = "Grid Availability (0 = No, 1 = Yes)",
        y = "Available Days"
      )
This chart compares available days between sites with and without grid availability. Sites with grid availability are expected to preserve diesel better because they do not depend on generator power all the time. This makes grid availability an important factor in diesel sustainability planning.
7.4 Distribution of Available Days
Code
    ggplot(my_data, aes(x = Available_Days)) +
      geom_histogram(bins = 15) +
      labs(
        title = "Distribution of Available Days",
        x = "Available Days",
        y = "Number of Sites"
      )
The histogram shows how available days are distributed across the selected sites. The summary statistics point to a right-skewed distribution: the median is about 10.4 days, below the mean of about 14 days, and a quarter of the sites have roughly 4 days or fewer. For operations, sites with low available days are more urgent because they may approach dry-date positions faster.
7.5 Top 10 Sites with Lowest Available Days
Code
    lowest_sites <- my_data %>%
      arrange(Available_Days) %>%
      slice_head(n = 10)

    ggplot(lowest_sites, aes(x = reorder(Site_ID, Available_Days), y = Available_Days)) +
      geom_col() +
      coord_flip() +
      labs(
        title = "Top 10 Sites with Lowest Available Days",
        x = "Site ID",
        y = "Available Days"
      )
This chart identifies the ten sites with the lowest available days. These sites represent the most immediate diesel sustainability risks in the dataset. In a real operational setting, they should be prioritised for review, escalation, or diesel intervention.
8. Hypothesis Testing
Two hypothesis tests were conducted. The first test examined whether available days differ significantly between sites with grid availability and sites without grid availability. The second test examined whether diesel level has a statistically significant relationship with available days.
8.1 Hypothesis Test 1: Grid Availability and Available Days
The hypotheses are:
Null hypothesis (H0): There is no significant difference in available days between sites with grid availability and sites without grid availability.
Alternative hypothesis (H1): There is a significant difference in available days between sites with grid availability and sites without grid availability.
Code
    # Hypothesis test: Available Days by Grid Availability
    t_test_grid <- t.test(
      Available_Days ~ Grid_Availability,
      data = my_data
    )
    t_test_grid
        Welch Two Sample t-test

    data:  Available_Days by Grid_Availability
    t = -2.0986, df = 115.87, p-value = 0.03802
    alternative hypothesis: true difference in means between group 0 and group 1 is not equal to 0
    95 percent confidence interval:
     -8.1141917 -0.2347052
    sample estimates:
    mean in group 0 mean in group 1
           11.91011        16.08456
The hypothesis test produced a p-value of 0.038. Since this is below the 0.05 significance level, the null hypothesis is rejected. This means there is a statistically significant difference in available days between sites with grid availability and sites without grid availability.
The result also shows that sites without grid availability had an average of about 11.91 available days, while sites with grid availability had an average of about 16.08 available days. This is a difference of about 4.2 days, with a 95% confidence interval of roughly 0.2 to 8.1 days. In business terms, this suggests that grid availability should be considered when prioritising sites for diesel intervention. Sites without grid availability are more exposed because they depend more heavily on diesel generation.
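The reported output can be cross-checked with simple arithmetic. The sketch below uses only the figures printed in the test output to recompute the difference in means, the midpoint of the confidence interval, and the standard error implied by the t statistic. The variable names are illustrative.

```r
# Figures copied from the Welch t-test output
mean_no_grid <- 11.91011   # group 0: sites without grid
mean_grid    <- 16.08456   # group 1: sites with grid
t_stat       <- -2.0986
ci           <- c(-8.1141917, -0.2347052)

# Difference in means (group 0 minus group 1, as in the test)
mean_diff <- mean_no_grid - mean_grid

# The confidence interval should be centred on that difference
ci_midpoint <- mean(ci)

# Standard error implied by t = difference / standard error
implied_se <- mean_diff / t_stat

print(round(c(mean_diff = mean_diff,
              ci_midpoint = ci_midpoint,
              implied_se = implied_se), 3))
```

The difference and the interval midpoint agree, which confirms that the interval describes a gap of a little over four days in favour of sites with grid availability.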
8.2 Hypothesis Test 2: Diesel Level and Available Days
The hypotheses are:
Null hypothesis (H0): There is no significant relationship between diesel level and available days.
Alternative hypothesis (H1): There is a significant relationship between diesel level and available days.
Code
    # Hypothesis test: Relationship between Diesel Level and Available Days
    cor_test_diesel <- cor.test(
      my_data$Diesel_Level,
      my_data$Available_Days
    )
    cor_test_diesel
        Pearson's product-moment correlation

    data:  my_data$Diesel_Level and my_data$Available_Days
    t = 8.9236, df = 116, p-value = 7.832e-15
    alternative hypothesis: true correlation is not equal to 0
    95 percent confidence interval:
     0.5168497 0.7341023
    sample estimates:
          cor
    0.6380033
The correlation test produced a p-value of 7.832e-15, which is far below the 0.05 significance level. Therefore, the null hypothesis is rejected. This means there is a statistically significant positive relationship between diesel level and available days. The correlation coefficient is 0.638, indicating a moderately strong positive relationship.
Operationally, this confirms that diesel level is a meaningful input in the available-days KPI. However, it should not be used alone because CPD and grid availability also influence how long diesel will last.
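The link between the correlation coefficient and the reported t statistic can be verified directly. For a Pearson correlation r computed on n observations, the test statistic is t = r * sqrt(n - 2) / sqrt(1 - r^2). The sketch below plugs in the values from the output above.

```r
# Values copied from the correlation test output
r <- 0.6380033   # sample correlation
n <- 118         # number of site observations (df = n - 2 = 116)

# t statistic for testing whether the true correlation is zero
t_stat <- r * sqrt(n - 2) / sqrt(1 - r^2)

# Share of variance in available days accounted for by diesel level alone
r_squared <- r^2

print(round(c(t_stat = t_stat, r_squared = r_squared), 4))
```

The recomputed statistic matches the reported value of 8.9236, and the squared correlation shows that diesel level on its own accounts for roughly 41% of the variation in available days.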
9. Correlation Analysis
Correlation analysis was used to examine the strength and direction of the relationship between diesel level, CPD, and available days.
The correlation analysis shows a positive correlation of 0.638 between diesel level and available days. This is expected because available days is an operational KPI based partly on diesel level. Higher diesel levels generally increase the number of days a site can continue operating.
The analysis also shows a negative correlation of -0.292 between CPD and available days. This is also operationally sensible because higher daily consumption reduces how long available diesel can sustain a site.
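Coefficients like these are typically read from a correlation matrix. The sketch below shows one way to produce such a matrix with base R. It runs on a small hypothetical data frame, and the ratio used to build Available_Days is an assumption, so the values will not match the reported 0.638 and -0.292.

```r
# Hypothetical data standing in for the spot-check records
demo <- data.frame(
  Diesel_Level = c(120, 200, 250, 300, 350, 400, 480, 600, 750, 900, 1200, 1500),
  CPD          = c(60, 20, 55, 15, 70, 25, 45, 30, 50, 35, 65, 40)
)

# Assumed formula for the KPI: diesel level divided by daily consumption
demo$Available_Days <- demo$Diesel_Level / demo$CPD

# Pearson correlation matrix for the three numeric variables
cor_matrix <- cor(demo[, c("Diesel_Level", "CPD", "Available_Days")])

print(round(cor_matrix, 3))
```

The last column of the matrix gives the correlation of each input with available days, which is the pair of figures discussed above.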
Although available days is based on diesel level and CPD, the correlation analysis remains useful because it shows how strongly these operational variables move together across the selected sites. It also reinforces the need to consider CPD alongside diesel level during diesel planning.
Because available days is derived from diesel level and CPD, these are the most operationally direct relationships in the dataset. However, correlation alone should not be treated as proof of causality. To confirm the operational effect more strongly, I would track the same sites over time before and after diesel refill or grid-restoration events. This would show whether changes in diesel level, CPD, or grid availability lead to measurable changes in available days and site risk.
10. Regression Analysis
Regression analysis was used to examine the combined relationship between diesel level, grid availability, CPD, and the available-days KPI.
The regression result shows that diesel level, grid availability, and CPD are all statistically significant explanatory variables for the available-days KPI. Diesel level has a positive coefficient of 0.0197: each additional litre of diesel is associated with about 0.02 more available days, or roughly 2 days per 100 litres, holding the other variables constant. CPD has a negative coefficient of -0.1188, meaning that each additional unit of daily consumption is associated with about 0.12 fewer available days. Grid availability has a positive coefficient of 7.2859, suggesting that sites with grid availability tend to have about 7.3 more available days than sites without grid availability, holding other variables constant.
The regression model should be interpreted as an explanatory model of the available-days KPI, not as proof of independent causation. Since Available_Days is an operational KPI derived from diesel level and CPD, the model helps show how these inputs relate to the sustainability indicator across sites. Grid availability is included to provide additional operational context because grid-powered sites may depend less on diesel generation.
The model has an R-squared value of 0.7247, meaning that diesel level, grid availability, and CPD together explain about 72.5% of the variation in available days across the sites. This is useful for operational planning because it confirms that diesel sustainability should be assessed using a combination of diesel level, CPD, grid status, and available days.
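The fitting code is not shown in this section. A model of this form is typically fitted with lm(), and the sketch below is an assumption about the specification, with the formula inferred from the three predictors named above. It runs on hypothetical data, so its coefficients and R-squared will not match the reported values.

```r
# Hypothetical data standing in for the spot-check records
demo <- data.frame(
  Diesel_Level      = c(120, 200, 250, 300, 350, 400, 480, 600, 750, 900, 1200, 1500),
  Grid_Availability = c(0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1),
  CPD               = c(60, 20, 55, 15, 70, 25, 45, 30, 50, 35, 65, 40)
)

# Assumed formula for the KPI: diesel level divided by daily consumption
demo$Available_Days <- demo$Diesel_Level / demo$CPD

# Multiple linear regression of the KPI on the three explanatory variables
model <- lm(Available_Days ~ Diesel_Level + Grid_Availability + CPD, data = demo)

# Coefficient estimates and share of variance explained
print(round(coef(model), 4))
print(round(summary(model)$r.squared, 4))
```

Calling summary(model) on the fitted object also prints the standard errors and p-values that underpin statements about statistical significance.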
10.1 Regression Diagnostic Plots
Code
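    # Show the four standard lm diagnostic plots in a 2 x 2 grid
    # (model is the fitted regression object)
    par(mfrow = c(2, 2))
    plot(model)

    # Restore the default single-plot layout
    par(mfrow = c(1, 1))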
The diagnostic plots were reviewed to assess whether the regression assumptions were reasonable. These plots help check linearity, residual spread, normality of residuals, and possible influential observations. Although some variation is expected in operational data, the model remains useful for explaining the main drivers of available days.
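Visual diagnostics can be complemented with numeric checks. The sketch below is an illustrative addition, not part of the original analysis: it fits the same assumed model on hypothetical data, then computes Cook's distance to flag potentially influential sites and runs a Shapiro-Wilk test on the residuals. The 4/n cut-off is a common rule of thumb rather than a fixed standard.

```r
# Hypothetical data standing in for the spot-check records
demo <- data.frame(
  Diesel_Level      = c(120, 200, 250, 300, 350, 400, 480, 600, 750, 900, 1200, 1500),
  Grid_Availability = c(0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1),
  CPD               = c(60, 20, 55, 15, 70, 25, 45, 30, 50, 35, 65, 40)
)
demo$Available_Days <- demo$Diesel_Level / demo$CPD  # assumed KPI formula

model <- lm(Available_Days ~ Diesel_Level + Grid_Availability + CPD, data = demo)

# Cook's distance: how much each observation shifts the fitted model
cooks <- cooks.distance(model)

# Flag observations above the 4/n rule-of-thumb threshold
influential <- which(cooks > 4 / nrow(demo))
print(influential)

# Shapiro-Wilk test of residual normality (small p-value suggests non-normality)
normality <- shapiro.test(residuals(model))
print(normality$p.value)
```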
11. Integrated Findings
The analysis provides consistent evidence that diesel sustainability is mainly influenced by diesel level, CPD, and grid availability. The descriptive statistics show that the dataset contains 118 complete site observations, making it suitable for the selected analysis. The average available days across the dataset is about 14 days, but some sites have much lower available days and therefore require closer attention.
The visualisations show that diesel level and available days move in the same direction, while CPD and available days move in opposite directions. This means that higher diesel volume improves site sustainability, while higher consumption reduces it. The grid availability comparison also shows that sites with grid availability tend to have higher average available days than sites without grid availability.
The hypothesis tests strengthen the analysis. The first test shows that grid availability makes a statistically significant difference in available days. The second test shows that diesel level has a statistically significant positive relationship with the available-days KPI. Since available days is based on diesel level and CPD, this result is operationally expected, but it still confirms that diesel level is a meaningful input in site sustainability planning.
The regression result further shows that diesel level, grid availability, and CPD are all statistically significant explanatory variables for available days. This supports the operational view that diesel sustainability should be assessed using a combination of current diesel level, daily consumption, grid condition, and available days.
Overall, the findings support a risk-based approach. Sites should not be prioritised based only on diesel level. A more reliable approach is to consider diesel level, CPD, grid availability, and available days together when deciding which sites require urgent intervention.
12. Recommendations
Based on the analysis, the following recommendations are made:
Sites with low available days should be prioritised for diesel intervention because they are closer to critical diesel levels.
Sites with high CPD should be monitored more frequently because they consume diesel faster and can reach dry-date positions quickly.
Sites without grid availability should receive special attention because they are more dependent on diesel generators and recorded lower average available days.
Diesel planning should combine diesel level, CPD, grid availability, and available days rather than relying only on current diesel volume.
A simple risk-ranking dashboard should be developed to flag sites with low diesel level, high CPD, no grid availability, and short available days.
Regional operations teams should use available days as a daily planning indicator for diesel allocation, escalation, and vendor follow-up.
The top ten sites with the lowest available days should be reviewed daily until their diesel position improves, because they represent the most immediate operational risk.
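The risk-ranking idea in these recommendations can be sketched as a simple additive score. Everything in the example is hypothetical: the site names, the values, and the thresholds, which borrow the first quartile of diesel level and the third quartile of CPD from the summary output purely for illustration. An operational dashboard would need thresholds agreed with the team.

```r
# Hypothetical sites (illustrative values, not taken from the dataset)
sites <- data.frame(
  Site_ID           = c("SITE_A", "SITE_B", "SITE_C", "SITE_D"),
  Grid_Availability = c(0, 1, 0, 1),
  Diesel_Level      = c(150, 400, 900, 250),
  CPD               = c(60, 20, 45, 70),
  Available_Days    = c(2.5, 20, 20, 3.6),
  stringsAsFactors  = FALSE
)

# Assumed thresholds for each risk flag
low_days_threshold   <- 5      # days
low_diesel_threshold <- 200.2  # litres (first quartile in the summary output)
high_cpd_threshold   <- 58     # litres per day (third quartile in the summary output)

# One point per risk flag raised
sites$Risk_Score <-
  (sites$Available_Days < low_days_threshold) +
  (sites$Diesel_Level < low_diesel_threshold) +
  (sites$CPD > high_cpd_threshold) +
  (sites$Grid_Availability == 0)

# Highest score first; ties broken by fewest available days
ranked <- sites[order(-sites$Risk_Score, sites$Available_Days), ]
print(ranked)
```

An equal-weight score keeps the ranking easy to explain to field teams. Weighting available days more heavily would be a reasonable refinement if dry-date proximity is the dominant concern.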
13. Limitations and Further Work
The study is based on spot-check data collected over a limited period. Although the dataset is useful for understanding diesel sustainability, it does not include all possible operational factors that may affect available days.
For example, the analysis does not include generator condition, site load, access challenges, security issues, refuelling delays, vendor response time, or actual outage hours. These factors may also influence how quickly a site becomes operationally exposed.
Future analysis can include more time periods and additional operational variables. A larger dataset covering several weeks or months would support stronger trend analysis and forecasting. Future work can also include outage hours, generator runtime, site load, and refuelling turnaround time to improve the model and make it more useful for operational planning.
14. Conclusion
This study shows how operational site data can support better diesel planning and risk prioritisation in telecom infrastructure operations. Using 118 site observations, the analysis found that diesel level, CPD, and grid availability are important explanatory variables for the available-days KPI used in diesel sustainability monitoring.
The results confirm that higher diesel levels are associated with longer diesel sustainability, while higher CPD is associated with fewer available days because the site consumes diesel faster. The analysis also shows that grid availability is associated with a statistically significant difference in available days. The regression model explains about 72.5% of the variation in available days, which means the selected variables provide useful operational insight.
For regional technical operations, the findings support a more proactive and risk-based approach to diesel management. Instead of waiting for sites to approach dry dates, operations teams can use diesel level, CPD, grid status, and available days to identify high-risk sites early and intervene before service availability is affected.
References
Adi, B. (2026). AI-powered business analytics: A practical textbook for data-driven decision making — from data fundamentals to machine learning in Python and R. Lagos Business School / markanalytics.online. https://markanalytics.online
Idegwu, V. (2026). Diesel sustainability spot-check data across selected telecom network sites [Dataset]. Collected from regional technical operations spot-check records, IHS Towers, Nigeria. Data available on request from the author.
R Core Team. (2024). R: A language and environment for statistical computing. R Foundation for Statistical Computing.
Wickham, H. (2016). ggplot2: Elegant graphics for data analysis. Springer.
Wickham, H., Averick, M., Bryan, J., Chang, W., McGowan, L., François, R., Grolemund, G., Hayes, A., Henry, L., Hester, J., Kuhn, M., Pedersen, T. L., Miller, E., Bache, S. M., Müller, K., Ooms, J., Robinson, D., Seidel, D., Spinu, V., & Yutani, H. (2019). Welcome to the tidyverse. Journal of Open Source Software, 4(43), 1686.
Robinson, D., Hayes, A., & Couch, S. (2024). broom: Convert statistical objects into tidy tibbles [R package].
AI Usage Statement
AI tools were used to support code structuring, interpretation guidance, and report organisation. The dataset, business context, analytical judgement, and final interpretation were based on my professional understanding of telecom operations and diesel sustainability monitoring as a Manager, Regional Technical Operations.