| Site_ID | Grid_Availability | Diesel_Level | Spotcheck_Date | CPD | Available_Days |
|---|---|---|---|---|---|
| Site_001 | 0 | 325 | 2026-05-11 | 60 | 5.416667 |
| Site_002 | 0 | 200 | 2026-05-11 | 50 | 4.000000 |
| Site_003 | 0 | 222 | 2026-05-11 | 55 | 4.036364 |
| Site_004 | 0 | 210 | 2026-05-11 | 50 | 4.200000 |
| Site_005 | 0 | 100 | 2026-05-11 | 65 | 1.538461 |
| Site_006 | 0 | 201 | 2026-05-11 | 50 | 4.020000 |
Diesel Sustainability Analysis Across Telecom Network Sites
Executive Summary
This study examines diesel sustainability across selected telecom network sites using anonymised operational spot-check data from a telecom infrastructure operations context. As a Manager, Regional Technical Operations, I work closely with site performance, diesel monitoring, grid availability, vendor follow-up, and operational stability. These areas are important because many telecom sites still depend on diesel generators when grid power is unavailable, unstable, or insufficient. If diesel levels and consumption patterns are not properly monitored, sites can move quickly towards dry-date positions, creating avoidable service risks.
The dataset used for this analysis contains 118 site observations. It includes site ID, grid availability, diesel level at spot-check, date of spot-check, field consumption per day (CPD), and available days. Available days was used as the main outcome variable because it is the operational KPI used by the team to estimate how long each site can continue operating before reaching a critical diesel position.
The study applied exploratory data analysis, visualisation, hypothesis testing, correlation analysis, and regression analysis. Diesel level is positively correlated with available days (r = 0.638), while CPD is negatively correlated with available days (r = -0.292). Sites with grid availability also average significantly more available days than sites without it (16.08 versus 11.91, p = 0.038).
The regression model explains about 72.5% of the variation in available days. Diesel level, grid availability, and CPD were all statistically significant explanatory variables for the available-days KPI. The findings support a more risk-based approach to diesel planning, where sites are prioritised using available days, CPD, diesel level, and grid status together rather than diesel level alone.
1. Introduction
Telecom network sites require stable power to remain available to customers and mobile network operators. In practice, grid power is not always reliable across all locations. Some sites have grid supply, while others depend more heavily on diesel generators. This makes diesel sustainability an important operational concern.
In regional technical operations, one of the key questions is not only whether a site has diesel, but how long that diesel can realistically sustain the site. A site with a moderate diesel level may still be at risk if its daily consumption is high. In the same way, a site with grid availability may preserve diesel for longer than a site operating mainly on generator supply.
This study focuses on the relationship between diesel level, grid availability, CPD, and available days across selected telecom network sites. The aim is to use operational data to identify the factors that influence diesel sustainability and to support better site prioritisation.
The business question guiding the analysis is:
How do diesel level, grid availability, and consumption per day affect available days across telecom network sites?
2. Professional Disclosure
I work as a Manager, Regional Technical Operations in the telecommunications infrastructure sector. My role involves monitoring site performance, diesel sustainability, grid availability, vendor response, and general operational stability across network sites.
Diesel monitoring is directly linked to my day-to-day work because a site with poor diesel sustainability can quickly become an operational risk. If such sites are not identified early, they may require emergency refuelling, escalation, or urgent vendor intervention. This can affect network availability and increase operational pressure on the regional team.
The five techniques used in this study are relevant to this operational context. Exploratory Data Analysis helps me understand the current condition of the dataset and identify unusual values or gaps. Visualisation helps present the pattern of diesel level, CPD, grid availability, and available days in a way that is easy for operations teams to understand. Hypothesis testing helps check whether observed differences or relationships are statistically meaningful. Correlation analysis helps show the strength and direction of relationships among key variables. Regression analysis helps estimate how diesel level, grid availability, and CPD jointly explain the available-days KPI.
These techniques are also aligned with the course textbook’s focus on using data analytics methods to support business decision-making. In this case, the analysis supports practical decisions around diesel planning, site prioritisation, and proactive intervention.
3. Data Collection and Sampling
The dataset was collected from operational diesel spot-check records across selected telecom network sites. The records contain site-level information on diesel level, grid availability, field CPD, date of spot-check, and available days.
The sampling frame consists of telecom sites under regional technical operations monitoring. Each observation represents one site spot-check record. The dataset contains 118 observations, which satisfies the minimum requirement of at least 100 observations for this assessment.
The variables used in this study are:
- Site_ID: Unique anonymised site identifier.
- Grid_Availability: Whether grid power was available at the site.
- Diesel_Level: Diesel volume recorded during the spot-check.
- Spotcheck_Date: Date the site diesel position was captured.
- CPD: Field consumption per day.
- Available_Days: Estimated number of days the available diesel can sustain the site. This was obtained from operational diesel monitoring records and is based on diesel level and field CPD.
Available days is used as an operational sustainability KPI because it translates diesel level and consumption rate into a practical planning indicator. It helps identify which sites are closest to dry-date risk and should be prioritised for intervention.
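The link between diesel level, CPD, and the KPI can be checked with simple arithmetic. The sketch below is in Python rather than the R used for the study, and it assumes the KPI is diesel level divided by CPD. The report only says the KPI is "based on" those two inputs, so the formula and the function name `available_days` are illustrative assumptions, tested against the six sample rows shown at the top of this report.

```python
# Sample rows copied from the data preview:
# (site, diesel level in litres, CPD, recorded available days).
PREVIEW = [
    ("Site_001", 325, 60, 5.416667),
    ("Site_002", 200, 50, 4.000000),
    ("Site_003", 222, 55, 4.036364),
    ("Site_004", 210, 50, 4.200000),
    ("Site_005", 100, 65, 1.538461),
    ("Site_006", 201, 50, 4.020000),
]


def available_days(diesel_level: float, cpd: float) -> float:
    """Days of autonomy, ASSUMING the KPI is diesel level divided by CPD."""
    if cpd <= 0:
        raise ValueError("CPD must be positive")
    return diesel_level / cpd


# Largest gap between the assumed formula and the recorded KPI in the sample.
max_gap = max(
    abs(available_days(diesel, cpd) - recorded)
    for _, diesel, cpd, recorded in PREVIEW
)
print(f"largest difference in sample rows: {max_gap:.7f}")
```

If the gap stays near zero on the full dataset as well, the KPI is an exact ratio of the other two variables, which matters when interpreting the correlation and regression results later in the report.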
For confidentiality, actual site identifiers were anonymised using generic labels such as Site_001, Site_002, and Site_003. This anonymisation only affects site names and does not change the numerical data used for the statistical tests, visualisations, or regression model.
4. Data Loading and Preparation
The dataset was loaded into R and prepared for analysis. The preparation stage included renaming columns, converting grid availability into a binary variable, formatting the spot-check date, and anonymising the site identifiers before displaying or plotting the data.
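As an illustration of these preparation steps, the following is a minimal Python sketch (the study itself used R, and its code is not shown here). The raw column names (`site`, `grid`, `diesel`, `date`, `cpd`), the "Yes"/"No" grid encoding, the ISO date strings, and the two input rows are all hypothetical placeholders, not values from the real export.

```python
from datetime import date


def prepare_records(raw_rows):
    """Return cleaned rows: anonymised ID, binary grid flag, parsed date, numeric fields."""
    cleaned = []
    for index, row in enumerate(raw_rows, start=1):
        cleaned.append({
            # Replace the real site name with a generic sequential label.
            "Site_ID": f"Site_{index:03d}",
            # ASSUMPTION: the raw export records grid status as "Yes"/"No".
            "Grid_Availability": 1 if row["grid"].strip().lower() == "yes" else 0,
            "Diesel_Level": float(row["diesel"]),
            # ASSUMPTION: the raw date is an ISO string such as "2026-05-11".
            "Spotcheck_Date": date.fromisoformat(row["date"]),
            "CPD": float(row["cpd"]),
        })
    return cleaned


# Made-up raw rows, used only to exercise the function.
raw = [
    {"site": "REAL-NAME-A", "grid": "No", "diesel": "325", "date": "2026-05-11", "cpd": "60"},
    {"site": "REAL-NAME-B", "grid": "Yes", "diesel": "200", "date": "2026-05-11", "cpd": "50"},
]
prepared = prepare_records(raw)
for record in prepared:
    print(record)
```

Anonymising inside the same step that builds the analysis table means the real site names never reach a plot or printed table.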
5. Data Description
The dataset contains both numerical and categorical variables. Diesel level, CPD, and available days are numerical variables. Grid availability was converted into a binary numeric variable for analysis, where 1 represents sites with grid availability and 0 represents sites without grid availability.
| Variable | Type | Description |
|---|---|---|
| SN | Numeric | Serial number for each observation |
| Site_ID | Character | Anonymised unique site identifier |
| Grid_Availability | Numeric/Binary | Binary grid status: 1 = grid available, 0 = no grid |
| Diesel_Level | Numeric | Diesel volume recorded during spot-check |
| Spotcheck_Date | Date | Date of diesel spot-check |
| CPD | Numeric | Consumption per day |
| Available_Days | Numeric | Estimated number of days diesel can sustain the site |

| Metric | Value |
|---|---|
| Number of observations | 118 |
| Average diesel level (litres) | 486.28 |
| Minimum diesel level (litres) | 30.00 |
| Maximum diesel level (litres) | 2382.00 |
| Average CPD (litres/day) | 41.19 |
| Average available days | 14.00 |
| Minimum available days | 0.60 |
| Maximum available days | 43.00 |
The summary statistics show that the dataset contains 118 site observations. The average diesel level is approximately 486.3 litres, the average CPD is approximately 41.2 litres per day, and the average available days is approximately 14. Variation across sites is wide: diesel level ranges from 30 to 2,382 litres and available days from 0.6 to 43. Some sites therefore have very low available days, while others have much higher diesel sustainability.
From an operational point of view, this variation is important because it means that sites should not be treated equally during diesel planning. Sites with short available days require faster attention than sites with longer sustainability.
6. Exploratory Data Analysis
Exploratory Data Analysis was carried out to understand the dataset before applying more formal statistical methods. The focus was on checking the structure of the data, identifying missing values, reviewing possible outliers, and understanding the basic distribution of diesel level, CPD, and available days.
| Variable | Missing_Values |
|---|---|
| SN | 0 |
| Site_ID | 0 |
| Grid_Availability | 0 |
| Diesel_Level | 0 |
| Spotcheck_Date | 0 |
| CPD | 0 |
| Available_Days | 0 |
The missing value check shows that there are no missing values in the dataset. This means all 118 site observations are complete for the selected variables and can be used for the analysis.
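A missing-value check of this kind can be sketched in a few lines. The Python version below is illustrative only (the study used R); the helper name `count_missing` and the rule that `None` or a blank string counts as missing are assumptions. It is run on two of the sample rows from the data preview.

```python
def count_missing(rows, columns):
    """Count, per column, entries that are absent, None, or blank strings."""
    counts = {column: 0 for column in columns}
    for row in rows:
        for column in columns:
            value = row.get(column)  # an absent key is treated as missing
            if value is None or (isinstance(value, str) and not value.strip()):
                counts[column] += 1
    return counts


COLUMNS = ["Site_ID", "Grid_Availability", "Diesel_Level",
           "Spotcheck_Date", "CPD", "Available_Days"]

# Two rows copied from the data preview.
rows = [
    {"Site_ID": "Site_001", "Grid_Availability": 0, "Diesel_Level": 325,
     "Spotcheck_Date": "2026-05-11", "CPD": 60, "Available_Days": 5.416667},
    {"Site_ID": "Site_002", "Grid_Availability": 0, "Diesel_Level": 200,
     "Spotcheck_Date": "2026-05-11", "CPD": 50, "Available_Days": 4.000000},
]
missing = count_missing(rows, COLUMNS)
print(missing)
```

Note that a grid value of 0 is a valid observation, not a missing one, which is why the check tests for `None` and blank strings rather than falsy values.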
| Variable | Statistic | Value |
|---|---|---|
| Diesel_Level | Minimum | 30.00 |
| Diesel_Level | Mean | 486.28 |
| Diesel_Level | Median | 300.00 |
| Diesel_Level | Maximum | 2382.00 |
| CPD | Minimum | 10.00 |
| CPD | Mean | 41.19 |
| CPD | Median | 40.00 |
| CPD | Maximum | 138.00 |
| Available_Days | Minimum | 0.60 |
| Available_Days | Mean | 14.00 |
| Available_Days | Median | 10.43 |
| Available_Days | Maximum | 43.00 |
Two data quality issues were considered during the EDA stage. First, missing values were checked across all variables, and the result showed no missing values. Therefore, no observation had to be removed due to incomplete data. Second, possible outliers were reviewed in diesel level, CPD, and available days. Some sites had very high diesel levels and available days compared with others, but these values were retained because they reflect realistic operational differences in site configuration, diesel holding capacity, grid condition, and consumption pattern.
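The report does not state which rule was used to review outliers. One common choice is the 1.5 × IQR rule, sketched below in Python as an assumption rather than as the study's actual method. The function name `iqr_outliers` is hypothetical, and the input is the diesel level of the six sample rows only, so the flags say nothing about the full 118-site dataset.

```python
from statistics import quantiles


def iqr_outliers(values, k=1.5):
    """Return values lying more than k * IQR below Q1 or above Q3."""
    # method="inclusive" treats the data as the whole population of interest.
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    spread = q3 - q1
    low, high = q1 - k * spread, q3 + k * spread
    return [value for value in values if value < low or value > high]


# Diesel levels (litres) of the six sample rows in the data preview.
diesel_preview = [325, 200, 222, 210, 100, 201]
flagged = iqr_outliers(diesel_preview)
print(flagged)
```

Flagging is only the first step. As the paragraph above explains, flagged values were retained here because they reflect real operational differences rather than recording errors.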
This stage helped confirm that the dataset was suitable for analysis. It also showed that available days varies meaningfully across sites, which makes it a useful outcome variable for diesel sustainability analysis.
7. Data Visualisation
Visualisation was used to communicate the main patterns in the data. The charts focus on the relationships between diesel level, CPD, grid availability, and available days. These visualisations are useful because operations managers often need quick and clear insights for site prioritisation.
7.1 Diesel Level and Available Days
Figure 1: Diesel Level vs Available Days
The chart shows a positive relationship between diesel level and available days. This means that sites with higher diesel levels generally have more available days. This is expected operationally because diesel level is a key input in the available-days KPI.
7.2 CPD and Available Days
Figure 2: CPD vs Available Days
This chart plots CPD against available days. The relationship is negative but weak (r = -0.292): sites with higher CPD consume diesel faster, which reduces available days. This matters because diesel volume alone does not give the full picture. A site may appear safe based on diesel level, but if its CPD is high, it may still require early intervention.
7.3 Grid Availability and Available Days
Figure 3: Grid Availability vs Available Days
This chart compares available days between sites with and without grid availability. Sites with grid availability are expected to preserve diesel better because they do not depend on generator power all the time. This makes grid availability an important factor in diesel sustainability planning.
7.4 Distribution of Available Days
Figure 4: Distribution of Available Days
The histogram shows how available days are distributed across the selected sites. This helps identify whether most sites have reasonable diesel sustainability or whether many sites are clustered around low available days. For operations, sites with low available days are more urgent because they may approach dry-date positions faster.
7.5 Top 10 Sites with Lowest Available Days
Figure 5: Top 10 Sites with Lowest Available Days
This chart identifies the ten sites with the lowest available days. These sites represent the most immediate diesel sustainability risks in the dataset. In a real operational setting, they should be prioritised for review, escalation, or diesel intervention.
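The shortlist behind a chart like this is a simple ranking. The Python sketch below (illustrative, not the R code used for the figure) picks the sites with the fewest available days. The function name `lowest_available_days` is hypothetical, and because only six sample rows are available here, the example asks for the lowest three rather than ten.

```python
from heapq import nsmallest

# (site, available days) pairs copied from the data preview.
PREVIEW = [
    ("Site_001", 5.416667),
    ("Site_002", 4.000000),
    ("Site_003", 4.036364),
    ("Site_004", 4.200000),
    ("Site_005", 1.538461),
    ("Site_006", 4.020000),
]


def lowest_available_days(sites, n=10):
    """Return the n sites with the fewest available days, most urgent first."""
    return nsmallest(n, sites, key=lambda site: site[1])


shortlist = lowest_available_days(PREVIEW, n=3)
for site_id, days in shortlist:
    print(f"{site_id}: {days:.2f} days")
```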
8. Hypothesis Testing
Two hypothesis tests were conducted. The first test examined whether available days differ significantly between sites with grid availability and sites without grid availability. The second test examined whether diesel level has a statistically significant relationship with available days.
8.1 Hypothesis Test 1: Grid Availability and Available Days
The hypotheses are:
- Null hypothesis (H0): There is no significant difference in available days between sites with grid availability and sites without grid availability.
- Alternative hypothesis (H1): There is a significant difference in available days between sites with grid availability and sites without grid availability.
| Test | Mean_No_Grid | Mean_With_Grid | Test_Statistic | P_Value | Decision |
|---|---|---|---|---|---|
| Independent samples t-test | 11.91 | 16.08 | -2.099 | 0.038 | Reject H0 |
The hypothesis test produced a p-value of 0.038. Since this is below the 0.05 significance level, the null hypothesis is rejected. This means there is a statistically significant difference in available days between sites with grid availability and sites without grid availability.
The result also shows that sites without grid availability had an average of about 11.91 available days, while sites with grid availability had an average of about 16.08 available days. In business terms, this suggests that grid availability should be considered when prioritising sites for diesel intervention. Sites without grid availability are more exposed because they depend more heavily on diesel generation.
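For readers who want to see how the test statistic is formed, here is a minimal Python sketch of a two-sample t statistic. It uses the Welch (unequal variance) form, which is the default in R's `t.test`. Whether the study used Welch or the pooled-variance version is not stated, so this is an assumption. The two small groups are made-up numbers chosen only to exercise the function, not values from the dataset.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(group_a, group_b):
    """Return (t statistic, Welch-Satterthwaite degrees of freedom) for group_a vs group_b."""
    # Squared standard error contributed by each group.
    se2_a = variance(group_a) / len(group_a)
    se2_b = variance(group_b) / len(group_b)
    t_stat = (mean(group_a) - mean(group_b)) / sqrt(se2_a + se2_b)
    df = (se2_a + se2_b) ** 2 / (
        se2_a ** 2 / (len(group_a) - 1) + se2_b ** 2 / (len(group_b) - 1)
    )
    return t_stat, df


# HYPOTHETICAL available-days values, for illustration only.
no_grid = [4.0, 5.0, 6.0]
with_grid = [8.0, 10.0, 12.0]
t_stat, df = welch_t(no_grid, with_grid)
print(f"t = {t_stat:.3f}, df = {df:.2f}")
```

Putting the no-grid group first matches the table above: the statistic is negative when sites without grid have fewer available days on average.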
8.2 Hypothesis Test 2: Diesel Level and Available Days
The hypotheses are:
- Null hypothesis (H0): There is no significant relationship between diesel level and available days.
- Alternative hypothesis (H1): There is a significant relationship between diesel level and available days.
| Test | Correlation | Test_Statistic | P_Value | Decision |
|---|---|---|---|---|
| Pearson correlation test | 0.638 | 8.924 | <0.001 | Reject H0 |
The correlation test produced a p-value of 7.832e-15, which is far below the 0.05 significance level. Therefore, the null hypothesis is rejected. This means there is a statistically significant positive relationship between diesel level and available days. The correlation coefficient is 0.638, indicating a moderately strong positive relationship.
Operationally, this confirms that diesel level is a meaningful input in the available-days KPI. However, it should not be used alone because CPD and grid availability also influence how long diesel will last.
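The reported test statistic can be reproduced from the correlation coefficient and the sample size alone, using the standard relationship t = r × sqrt(n − 2) / sqrt(1 − r²). The short Python check below uses the rounded values from the table (r = 0.638, n = 118), so a small rounding difference from the reported 8.924 is possible. The function name `correlation_t` is illustrative.

```python
from math import sqrt


def correlation_t(r, n):
    """t statistic for testing a Pearson correlation of r against zero, n pairs."""
    return r * sqrt(n - 2) / sqrt(1 - r ** 2)


t_stat = correlation_t(0.638, 118)
print(f"t = {t_stat:.3f} on {118 - 2} degrees of freedom")
```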
9. Correlation Analysis
Correlation analysis was used to examine the strength and direction of the relationship between diesel level, CPD, and available days.
| | Diesel_Level | CPD | Available_Days |
|---|---|---|---|
| Diesel_Level | 1.000 | 0.339 | 0.638 |
| CPD | 0.339 | 1.000 | -0.292 |
| Available_Days | 0.638 | -0.292 | 1.000 |
Figure 6: Correlation plot
The correlation analysis shows a positive correlation of 0.638 between diesel level and available days. This is expected because available days is an operational KPI based partly on diesel level. Higher diesel levels generally increase the number of days a site can continue operating.
The analysis also shows a negative correlation of -0.292 between CPD and available days. This is also operationally sensible because higher daily consumption reduces how long available diesel can sustain a site.
Although available days is based on diesel level and CPD, the correlation analysis remains useful because it shows how strongly these operational variables move together across the selected sites. It also reinforces the need to consider CPD alongside diesel level during diesel planning.
Because available days is derived from diesel level and CPD, these are the most operationally direct relationships in the dataset. However, correlation alone should not be treated as proof of causality. To confirm the operational effect more strongly, I would track the same sites over time before and after diesel refill or grid-restoration events. This would show whether changes in diesel level, CPD, or grid availability lead to measurable changes in available days and site risk.
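The coefficients in the matrix above are Pearson correlations. A minimal Python sketch of how such a matrix is built is given below (the study itself used R). The helper names `pearson` and `correlation_matrix` are illustrative, and the input is only the six sample rows from the data preview, so the output will not reproduce the 118-site figures.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    covariation = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return covariation / (spread_x * spread_y)


def correlation_matrix(columns):
    """Nested dict of pairwise Pearson correlations between named columns."""
    names = list(columns)
    return {a: {b: pearson(columns[a], columns[b]) for b in names} for a in names}


# Six sample rows from the data preview, stored column-wise.
preview = {
    "Diesel_Level": [325, 200, 222, 210, 100, 201],
    "CPD": [60, 50, 55, 50, 65, 50],
    "Available_Days": [5.416667, 4.0, 4.036364, 4.2, 1.538461, 4.02],
}
matrix = correlation_matrix(preview)
for name, row in matrix.items():
    print(name, {other: round(value, 3) for other, value in row.items()})
```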
10. Regression Analysis
Regression analysis was used to examine the combined relationship between diesel level, grid availability, CPD, and the available-days KPI.
The regression model is:
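Available_Days = b0 + b1 × Diesel_Level + b2 × Grid_Availability + b3 × CPD + error
where b0 is the intercept and b1 to b3 are the coefficients reported in the table below.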
| term | estimate | std.error | statistic | p.value |
|---|---|---|---|---|
| (Intercept) | 5.6766 | 2.8376 | 2.001 | 0.0478 |
| Diesel_Level | 0.0197 | 0.0012 | 16.082 | <0.001 |
| Grid_Availability | 7.2859 | 2.2665 | 3.215 | 0.0017 |
| CPD | -0.1188 | 0.0379 | -3.134 | 0.0022 |

| Metric | Value |
|---|---|
| R-squared | 0.7247 |
| Adjusted R-squared | 0.7175 |
| Residual standard error | 5.8251 |
| F-statistic | 100.055 |
| Model p-value | <0.001 |
| Predictor degrees of freedom | 3 |
| Residual degrees of freedom | 114 |
| Number of observations | 118 |
The regression result shows that diesel level, grid availability, and CPD are all statistically significant explanatory variables for the available-days KPI (all p < 0.05). Diesel level has a positive coefficient of 0.0197: holding the other variables constant, each additional 100 litres is associated with about 1.97 more available days. CPD has a negative coefficient of -0.1188, so each additional litre per day of consumption is associated with about 0.12 fewer available days. Grid availability has a positive coefficient of 7.2859, suggesting that sites with grid availability have about 7.3 more available days than otherwise comparable sites without grid.
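To make the coefficients concrete, the fitted equation can be applied to a single site. The Python sketch below plugs the rounded estimates from the coefficient table into a linear prediction for Site_001 from the data preview (325 litres, no grid, CPD of 60). The function name `predict_available_days` is illustrative, and because the coefficients are rounded to four decimals, the result is approximate.

```python
# Rounded estimates copied from the regression coefficient table.
COEFFICIENTS = {
    "intercept": 5.6766,
    "Diesel_Level": 0.0197,
    "Grid_Availability": 7.2859,
    "CPD": -0.1188,
}


def predict_available_days(diesel_level, grid_availability, cpd):
    """Fitted available days from the linear model (grid_availability is 0 or 1)."""
    return (
        COEFFICIENTS["intercept"]
        + COEFFICIENTS["Diesel_Level"] * diesel_level
        + COEFFICIENTS["Grid_Availability"] * grid_availability
        + COEFFICIENTS["CPD"] * cpd
    )


# Site_001 from the data preview: 325 litres, no grid, CPD 60, recorded 5.416667 days.
fitted_site_001 = predict_available_days(325, 0, 60)
residual_site_001 = 5.416667 - fitted_site_001
print(f"fitted = {fitted_site_001:.2f} days, residual = {residual_site_001:.2f} days")
```

The gap between the fitted and recorded values is the residual. It is a reminder that a straight-line model only approximates a KPI that is built from diesel level and CPD.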
The regression model should be interpreted as an explanatory model of the available-days KPI, not as proof of independent causation. Since Available_Days is an operational KPI derived from diesel level and CPD, the model helps show how these inputs relate to the sustainability indicator across sites. Grid availability is included to provide additional operational context because grid-powered sites may depend less on diesel generation.
The model has an R-squared value of 0.7247, meaning that diesel level, grid availability, and CPD together explain about 72.5% of the variation in available days across the sites. This is useful for operational planning because it confirms that diesel sustainability should be assessed using a combination of diesel level, CPD, grid status, and available days.
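The other fit statistics in the table follow directly from R-squared, the number of observations (n = 118), and the number of predictors (k = 3). The short Python check below recomputes the adjusted R-squared and the F-statistic from the rounded R-squared of 0.7247, so small rounding differences from the reported values are expected. The function names are illustrative.

```python
def adjusted_r_squared(r_squared, n, k):
    """Adjusted R-squared for n observations and k predictors."""
    return 1 - (1 - r_squared) * (n - 1) / (n - k - 1)


def f_statistic(r_squared, n, k):
    """Overall F-statistic on (k, n - k - 1) degrees of freedom."""
    return (r_squared / k) / ((1 - r_squared) / (n - k - 1))


adj_r2 = adjusted_r_squared(0.7247, n=118, k=3)
f_value = f_statistic(0.7247, n=118, k=3)
print(f"adjusted R-squared = {adj_r2:.4f}, F = {f_value:.2f}")
```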
10.1 Regression Diagnostic Plots
Figure 7: Regression diagnostic plots
The diagnostic plots were reviewed to assess whether the regression assumptions were reasonable. These plots help check linearity, residual spread, normality of residuals, and possible influential observations. Although some variation is expected in operational data, the model remains useful for explaining the main drivers of available days.
11. Integrated Findings
The analysis provides consistent evidence that diesel sustainability is mainly influenced by diesel level, CPD, and grid availability. The descriptive statistics show that the dataset contains 118 complete site observations, making it suitable for the selected analysis. The average available days across the dataset is about 14 days, but some sites have much lower available days and therefore require closer attention.
The visualisations show that diesel level and available days move in the same direction, while CPD and available days move in opposite directions. This means that higher diesel volume improves site sustainability, while higher consumption reduces it. The grid availability comparison also shows that sites with grid availability tend to have higher average available days than sites without grid availability.
The hypothesis tests strengthen the analysis. The first test shows that grid availability makes a statistically significant difference in available days. The second test shows that diesel level has a statistically significant positive relationship with the available-days KPI. Since available days is based on diesel level and CPD, this result is operationally expected, but it still confirms that diesel level is a meaningful input in site sustainability planning.
The regression result further shows that diesel level, grid availability, and CPD are all statistically significant explanatory variables for available days. This supports the operational view that diesel sustainability should be assessed using a combination of current diesel level, daily consumption, grid condition, and available days.
Overall, the findings support a risk-based approach. Sites should not be prioritised based only on diesel level. A more reliable approach is to consider diesel level, CPD, grid availability, and available days together when deciding which sites require urgent intervention.
12. Recommendations
Based on the analysis, the following recommendations are made:
- Sites with low available days should be prioritised for diesel intervention because they are closer to critical diesel levels.
- Sites with high CPD should be monitored more frequently because they consume diesel faster and can reach dry-date positions quickly.
- Sites without grid availability should receive special attention because they are more dependent on diesel generators and recorded lower average available days.
- Diesel planning should combine diesel level, CPD, grid availability, and available days rather than relying only on current diesel volume.
- A simple risk-ranking dashboard should be developed to flag sites with low diesel level, high CPD, no grid availability, and short available days.
- Regional operations teams should use available days as a daily planning indicator for diesel allocation, escalation, and vendor follow-up.
- The top ten sites with the lowest available days should be reviewed daily until their diesel position improves, because they represent the most immediate operational risk.
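The risk-ranking idea in these recommendations can be prototyped before any dashboard is built. The Python sketch below counts how many of the four risk conditions each site meets and sorts the sites accordingly. The thresholds (5 days, 55 CPD, 150 litres) are hypothetical placeholders that would need to be set by the operations team, the function names are illustrative, and the input is the six sample rows from the data preview.

```python
# HYPOTHETICAL cut-offs, for illustration only.
THRESHOLDS = {"available_days": 5.0, "cpd": 55.0, "diesel_level": 150.0}

# Sample rows copied from the data preview.
SITES = [
    {"Site_ID": "Site_001", "Grid_Availability": 0, "Diesel_Level": 325, "CPD": 60, "Available_Days": 5.416667},
    {"Site_ID": "Site_002", "Grid_Availability": 0, "Diesel_Level": 200, "CPD": 50, "Available_Days": 4.000000},
    {"Site_ID": "Site_003", "Grid_Availability": 0, "Diesel_Level": 222, "CPD": 55, "Available_Days": 4.036364},
    {"Site_ID": "Site_004", "Grid_Availability": 0, "Diesel_Level": 210, "CPD": 50, "Available_Days": 4.200000},
    {"Site_ID": "Site_005", "Grid_Availability": 0, "Diesel_Level": 100, "CPD": 65, "Available_Days": 1.538461},
    {"Site_ID": "Site_006", "Grid_Availability": 0, "Diesel_Level": 201, "CPD": 50, "Available_Days": 4.020000},
]


def risk_flags(site):
    """List the risk conditions a site meets under the hypothetical thresholds."""
    flags = []
    if site["Available_Days"] < THRESHOLDS["available_days"]:
        flags.append("short available days")
    if site["CPD"] > THRESHOLDS["cpd"]:
        flags.append("high CPD")
    if site["Grid_Availability"] == 0:
        flags.append("no grid")
    if site["Diesel_Level"] < THRESHOLDS["diesel_level"]:
        flags.append("low diesel level")
    return flags


def rank_sites(sites):
    """Most flags first; ties broken by fewest available days."""
    return sorted(sites, key=lambda s: (-len(risk_flags(s)), s["Available_Days"]))


ranked = rank_sites(SITES)
for site in ranked:
    print(site["Site_ID"], len(risk_flags(site)), ", ".join(risk_flags(site)))
```

A flag count is deliberately simple. Weighting the conditions, for example giving short available days more weight than grid status, is a design choice the team would need to agree on.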
13. Limitations and Further Work
The study is based on spot-check data collected over a limited period. Although the dataset is useful for understanding diesel sustainability, it does not include all possible operational factors that may affect available days.
For example, the analysis does not include generator condition, site load, access challenges, security issues, refuelling delays, vendor response time, or actual outage hours. These factors may also influence how quickly a site becomes operationally exposed.
Future analysis can include more time periods and additional operational variables. A larger dataset covering several weeks or months would support stronger trend analysis and forecasting. Future work can also include outage hours, generator runtime, site load, and refuelling turnaround time to improve the model and make it more useful for operational planning.
14. Conclusion
This study shows how operational site data can support better diesel planning and risk prioritisation in telecom infrastructure operations. Using 118 site observations, the analysis found that diesel level, CPD, and grid availability are important explanatory variables for the available-days KPI used in diesel sustainability monitoring.
The results confirm that higher diesel levels are associated with longer diesel sustainability, while higher CPD is associated with fewer available days because the site consumes diesel faster. Grid availability is also significantly associated with diesel sustainability: sites with grid recorded more available days on average. The regression model explains about 72.5% of the variation in available days, which means the selected variables provide useful operational insight.
For regional technical operations, the findings support a more proactive and risk-based approach to diesel management. Instead of waiting for sites to approach dry dates, operations teams can use diesel level, CPD, grid status, and available days to identify high-risk sites early and intervene before service availability is affected.
References
Adi, B. (2026). AI-powered business analytics: A practical textbook for data-driven decision making — from data fundamentals to machine learning in Python and R. Lagos Business School / markanalytics.online. https://markanalytics.online
Idegwu, V. (2026). Diesel sustainability spot-check data across selected telecom network sites [Dataset]. Anonymised operational dataset prepared for academic analysis. Data available on request from the author.
R Core Team. (2024). R: A language and environment for statistical computing. R Foundation for Statistical Computing.
Robinson, D., Hayes, A., & Couch, S. (2024). broom: Convert statistical objects into tidy tibbles [R package].
Wickham, H. (2016). ggplot2: Elegant graphics for data analysis. Springer.
Wickham, H., Averick, M., Bryan, J., Chang, W., McGowan, L., François, R., Grolemund, G., Hayes, A., Henry, L., Hester, J., Kuhn, M., Pedersen, T. L., Miller, E., Bache, S. M., Müller, K., Ooms, J., Robinson, D., Seidel, D., Spinu, V., & Yutani, H. (2019). Welcome to the tidyverse. Journal of Open Source Software, 4(43), 1686.
AI Usage Statement
AI tools were used to support code structuring, interpretation guidance, report organisation, and anonymisation of site identifiers. The dataset, business context, analytical judgement, and final interpretation were based on my professional understanding of telecom operations and diesel sustainability monitoring as a Manager, Regional Technical Operations.